Python multiprocessing takes more time
Question
I have a server with 12 cores and 28GB of RAM. I am running two versions of a Python script: one with multiprocessing and one sequential. I expected Multiprocessing.py to finish earlier than Sequential.py, but the multiprocessing code takes about 5 times longer (120s) than the sequential code (25s).
Multiprocessing.py
```python
import os, multiprocessing, time

def cube(x):
    print(x**3)
    return

if __name__ == '__main__':
    jobs = []
    start = time.time()
    for i in range(5000):
        p = multiprocessing.Process(target=cube(i))
        jobs.append(p)
        p.start()
    end = time.time()
    print(end - start)
```
Sequential.py
```python
import os, time

def cube(x):
    print(x**3)
    return

if __name__ == '__main__':
    start = time.time()
    for i in range(5000):
        cube(i)
    end = time.time()
    print(end - start)
```
Can you please help?
Answer
The problem is that too little work is being done relative to the IPC communication overhead.
The cube function isn't a good candidate for a multiprocessing speedup. Try something "more interesting", such as a function that computes the sum of cubes from 1 to n:
```python
import os, multiprocessing, time

def sum_of_cubes(n):
    return sum(x**3 for x in range(n))

if __name__ == '__main__':
    from multiprocessing.pool import ThreadPool as Pool
    pool = Pool(25)
    start = time.time()
    print(pool.map(sum_of_cubes, range(1000, 100000, 1000)))
    end = time.time()
    print(end - start)
```
The general rules are:
- Don't start more pool workers than your cores can benefit from.
- Don't pass in or return large amounts of data (too much IPC load).
- Do a substantial amount of work in each process relative to the IPC overhead.