Why does threading increase processing time?


Problem description

I was working on multithreading a basic 2-D DLA simulation. Diffusion-Limited Aggregation (DLA) is when particles perform a random walk and aggregate when they touch the current aggregate.

In the simulation, I have 10,000 particles walking in a random direction at each step. I use a pool of workers and a queue to feed them: I feed them with lists of particles, and each worker calls the method .updatePositionAndAggregate() on every particle in its list.

If I have one worker, I feed it a list of 10,000 particles; if I have two workers, I feed them a list of 5,000 particles each; if I have three workers, a list of about 3,333 each, and so on.
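The slicing described above can be sketched as a standalone snippet (a toy integer list stands in for the real particle objects; the `chunk` name is illustrative, not from the question):

```python
def chunk(items, n):
    """Split items into n contiguous, near-equal slices."""
    return [items[i * len(items) // n:(i + 1) * len(items) // n]
            for i in range(n)]

particles = list(range(10000))
for workers in (1, 2, 3):
    # Every particle lands in exactly one slice; the last slice
    # absorbs the remainder when the division is not exact.
    sizes = [len(s) for s in chunk(particles, workers)]
    print(workers, sizes)
```

With 10,000 particles and 3 workers this yields slices of 3333, 3333, and 3334 particles, matching the "3,333 each" description up to rounding.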

Here is the code for the worker:

class Worker(Thread):
    """
    The worker class is here to process a list of particles and try to aggregate
    them.
    """

    def __init__(self, name, particles):
        """
        Initialize the worker and its events.
        """
        Thread.__init__(self, name = name)
        self.daemon = True
        self.particles = particles
        self.start()

    def run(self):
        """
        The worker is started just after its creation and waits to be fed a
        list of particles to process.
        """

        while True:

            particles = self.particles.get()
            # print self.name + ': wake up with ' + str(len(self.particles)) + ' particles' + '\n'

            # Process the particles the worker has been fed.
            for particle in particles:
                particle.updatePositionAndAggregate()

            self.particles.task_done()
            # print self.name + ': is done' + '\n'

And in the main thread:

# Create the workers.
workerQueue = Queue(num_threads)
for i in range(0, num_threads):
    Worker("worker_" + str(i), workerQueue)

# Run the simulation until all the particles have been created.
while some_condition():

    # Feed all the workers.
    startWorker = datetime.datetime.now()
    for i in range(0, num_threads):
        j = i * len(particles) / num_threads
        k = (i + 1) * len(particles) / num_threads

        # Feeding the worker thread.
        # print "main: feeding " + worker.name + ' ' + str(len(worker.particles)) + ' particles\n'
        workerQueue.put(particles[j:k])


    # Wait for all the workers
    workerQueue.join()

    workerDurations.append((datetime.datetime.now() - startWorker).total_seconds())
    print sum(workerDurations) / len(workerDurations)

So I print the average time spent waiting for the workers to finish their tasks. I ran some experiments with different numbers of threads.

| num threads | average worker duration (s) |
|-------------|-----------------------------|
| 1           | 0.147835636364                |
| 2           | 0.228585818182                |
| 3           | 0.258296454545                |
| 10          | 0.294294636364                |

I really wonder why adding workers increases the processing time. I thought that having at least 2 workers would decrease it, but it dramatically increases from 0.14 s to 0.23 s. Can you explain why?

So, if the explanation is Python's threading implementation, is there a way to get real multitasking?

Answer

This is happening because the threads don't execute simultaneously: due to the GIL (Global Interpreter Lock), Python can execute only one thread at a time.

When you spawn a new thread, everything freezes except that thread. When it stops, another one is executed. Spawning threads also takes a lot of time.

Frankly speaking, the exact code doesn't matter: in Python, code using 100 threads is SLOWER than code using 10 threads for CPU-bound work like this (the idea that more threads always means more efficiency and speed is simply not true).
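That claim can be illustrated with a standalone sketch (not the question's code): a CPU-bound loop run twice in two threads produces the same result as running it twice serially, but under CPython's GIL the threaded version is typically no faster, because only one thread executes bytecode at any instant.

```python
import threading

def busy_sum(n, out, idx):
    """CPU-bound work: sum the first n integers in pure Python."""
    total = 0
    for i in range(n):
        total += i
    out[idx] = total

N = 200000

# Serial version: two calls, one after the other.
serial = [0, 0]
busy_sum(N, serial, 0)
busy_sum(N, serial, 1)

# Threaded version: two threads doing the same two calls. The GIL
# serializes their bytecode, so wall-clock time is about the same
# (and often worse, once thread start-up and switching are counted).
threaded = [0, 0]
threads = [threading.Thread(target=busy_sum, args=(N, threaded, i))
           for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(serial == threaded)  # same answer, no speedup
```

Wrapping both versions in a timer (e.g. `time.perf_counter()`) reproduces the pattern in the question's table: more threads, same or longer total time.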

Here is an exact quote from the Python docs:

CPython implementation detail:

In CPython, due to the Global Interpreter Lock, only one thread can execute Python code at once (even though certain performance-oriented libraries might overcome this limitation). If you want your application to make better use of the computational resources of multi-core machines, you are advised to use multiprocessing or concurrent.futures.ProcessPoolExecutor. However, threading is still an appropriate model if you want to run multiple I/O-bound tasks simultaneously.

Wikipedia on the GIL

StackOverflow on the GIL
