Python threading vs. multiprocessing in Linux


Question

Based on this question, I assumed that creating a new process should be almost as fast as creating a new thread in Linux. However, a small test showed a very different result. Here's my code:

from multiprocessing import Process, Pool
from threading import Thread

times = 1000

def inc(a):
    b = 1
    return a + b

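def processes():
    # Start one process at a time and wait for it to exit before starting the next.
    for i in xrange(times):
        p = Process(target=inc, args=(i, ))
        p.start()
        p.join()

def threads():
    # Same loop with threads: start one, wait for it, then start the next.
    for i in xrange(times):
        t = Thread(target=inc, args=(i, ))
        t.start()
        t.join()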

Tests:

>>> timeit processes() 
1 loops, best of 3: 3.8 s per loop

>>> timeit threads() 
10 loops, best of 3: 98.6 ms per loop

So, processes are almost 40 times slower to create! Why does this happen? Is it specific to Python or to these libraries? Or did I just misinterpret the answer above?

UPD 1. To make it clearer: I understand that this piece of code doesn't actually introduce any concurrency. The goal here is to test the time needed to create a process and a thread. To use real concurrency with Python, one can use something like this:

def pools():
    pool = Pool(10)
    pool.map(inc, xrange(times))

which really runs much faster than the threaded version.

UPD 2. I have added a version with os.fork():

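import os

for i in xrange(times):  # times = 1000, as defined above
    child_pid = os.fork()
    if child_pid:
        # Parent: wait for the child to exit before forking the next one.
        os.waitpid(child_pid, 0)
    else:
        # Child: exit immediately.
        exit(-1)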

Results are:

$ time python test_fork.py 

real    0m3.919s
user    0m0.040s
sys     0m0.208s

$ time python test_multiprocessing.py 

real    0m1.088s
user    0m0.128s
sys     0m0.292s

$ time python test_threadings.py

real    0m0.134s
user    0m0.112s
sys     0m0.048s

Recommended answer

The question you linked to compares the cost of just calling fork(2) vs. pthread_create(3), whereas your code does quite a bit more, e.g. it uses join() to wait for the processes/threads to terminate.
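To see how much of the measured time is creation and how much is waiting, the two phases can be timed separately. The following is only a minimal sketch, assuming Python 3 (hence range and time.perf_counter rather than the Python 2 code in the question); the helper name time_start_and_join and its count parameter are hypothetical and do not come from the question.

```python
import time
from multiprocessing import Process
from threading import Thread


def inc(a):
    return a + 1


def time_start_and_join(factory, count=20):
    """Return (seconds spent in start(), seconds spent in join()).

    factory is a class with the Thread/Process interface; both accept
    target= and args= and provide start() and join().
    """
    start_total = 0.0
    join_total = 0.0
    for i in range(count):
        worker = factory(target=inc, args=(i,))
        t0 = time.perf_counter()
        worker.start()   # creation cost only
        t1 = time.perf_counter()
        worker.join()    # waiting for the worker to run and terminate
        t2 = time.perf_counter()
        start_total += t1 - t0
        join_total += t2 - t1
    return start_total, join_total
```

Calling time_start_and_join(Thread) and time_start_and_join(Process) from a script guarded by if __name__ == '__main__' gives one pair of totals for each. How the time splits between start() and join() depends on the machine and the Python version, so no expected numbers are given here.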

If, as you say...

"The goal here is to test the time needed to create a process and a thread."

...then you shouldn't be waiting for them to complete. You should be using test programs more like these...

fork.py

import os
import time

def main():
    for i in range(100):
        pid = os.fork()
        if pid:
            #print 'created new process %d' % pid
            continue
        else:
            time.sleep(1)
            return

if __name__ == '__main__':
    main()

thread.py

import thread
import time

def dummy():
    time.sleep(1)

def main():
    for i in range(100):
        tid = thread.start_new_thread(dummy, ())
        #print 'created new thread %d' % tid

if __name__ == '__main__':
    main()

...which give the following results...

$ time python fork.py
real    0m0.035s
user    0m0.008s
sys     0m0.024s

$ time python thread.py
real    0m0.032s
user    0m0.012s
sys     0m0.024s

...so there's not much difference in the creation time of threads and processes.
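The same comparison can be made inside a single script, which keeps interpreter start-up out of the measurement. This is a rough sketch, assuming Python 3 on a POSIX system where os.fork() is available; the function names and the count and sleep parameters are hypothetical. In both functions the clock stops before any waiting happens, and the children or threads are only cleaned up afterwards.

```python
import os
import threading
import time


def time_fork_creation(count=100, child_sleep=0.1):
    """Time only the os.fork() calls; children are reaped after the clock stops."""
    pids = []
    t0 = time.perf_counter()
    for _ in range(count):
        pid = os.fork()
        if pid == 0:
            # Child: stay alive briefly, then exit without running
            # the parent's cleanup handlers.
            try:
                time.sleep(child_sleep)
            finally:
                os._exit(0)
        pids.append(pid)
    elapsed = time.perf_counter() - t0
    for pid in pids:
        os.waitpid(pid, 0)  # reap children outside the timed region
    return elapsed


def time_thread_creation(count=100, thread_sleep=0.1):
    """Time only creating and starting threads; joins happen after the clock stops."""
    threads = []
    t0 = time.perf_counter()
    for _ in range(count):
        t = threading.Thread(target=time.sleep, args=(thread_sleep,))
        t.start()
        threads.append(t)
    elapsed = time.perf_counter() - t0
    for t in threads:
        t.join()  # wait outside the timed region
    return elapsed
```

Calling time_fork_creation() before time_thread_creation() avoids forking while other threads are still running, which recent Python versions warn about.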
