Joblib simple example: the parallel version is slower than the serial one
Question
from math import sqrt
from joblib import Parallel, delayed
import time

if __name__ == '__main__':
    st = time.time()
    # Serial (non-parallel) version:
    # [sqrt(i ** 2) for i in range(100000)]
    # Parallel version: one joblib task per number
    Parallel(n_jobs=2)(delayed(sqrt)(i ** 2) for i in range(100000))
    print(time.time() - st)
The non-parallel version runs in 0.4 s, while the parallel version takes 18 s. I am confused about why this happens.
Recommended answer
Parallel processes (which joblib creates) require copying data. Imagine it this way: you have two people who each carry a rock to their house, polish it, then bring it back. That is far slower than one person polishing the rocks on the spot.
All the time is wasted in transit rather than spent on the actual calculation. Parallel processes only pay off for more substantial computational tasks.
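One common way to reduce the "transit" cost is to hand each worker a large chunk of the range rather than one number per task. The following is a minimal sketch under that assumption; the helper names sqrt_chunk and parallel_sqrt and the n_chunks parameter are hypothetical and do not come from the original answer.

```python
from math import sqrt
from joblib import Parallel, delayed


def sqrt_chunk(start, stop):
    """Compute sqrt(i ** 2) for every i in [start, stop) inside a single task."""
    return [sqrt(i ** 2) for i in range(start, stop)]


def parallel_sqrt(n, n_jobs=2, n_chunks=4):
    """Split range(n) into about n_chunks pieces and dispatch one task per piece."""
    step = max(1, (n + n_chunks - 1) // n_chunks)  # ceiling division, never 0
    bounds = [(lo, min(lo + step, n)) for lo in range(0, n, step)]
    chunks = Parallel(n_jobs=n_jobs)(
        delayed(sqrt_chunk)(lo, hi) for lo, hi in bounds
    )
    # Flatten the per-chunk lists back into one list, preserving order.
    return [x for chunk in chunks for x in chunk]


if __name__ == '__main__':
    result = parallel_sqrt(100000)
    print(len(result))
```

Only the chunk boundaries travel to the workers, so there are a handful of tasks instead of 100000. For work this cheap, the cost of starting the workers and sending the results back may still make the plain serial loop faster.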
If you care about speeding up this specific operation: use numpy's vectorized math operations. On my machine, parallel: 1.13 s, serial: 54.6 ms, numpy: 3.74 ms.
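import numpy as np

# np.int was removed from recent NumPy releases; use an explicit 64-bit integer
# so that a ** 2 (up to about 1e10) does not overflow.
a = np.arange(100000, dtype=np.int64)
np.sqrt(a ** 2)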
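To reproduce the serial versus numpy comparison on your own machine, a small timing harness along these lines may help. It is only a sketch: the function names serial and vectorized and the repeat counts are assumptions, and the numbers it prints will differ from the ones quoted above.

```python
import timeit
from math import sqrt

import numpy as np

N = 100000


def serial():
    """Pure-Python loop: one math.sqrt call per element."""
    return [sqrt(i ** 2) for i in range(N)]


def vectorized():
    """Same computation as a single vectorized numpy expression."""
    a = np.arange(N, dtype=np.int64)
    return np.sqrt(a ** 2)


if __name__ == '__main__':
    # Both approaches should agree before their timings are compared.
    assert np.allclose(serial(), vectorized())
    for name, fn in [('serial', serial), ('numpy', vectorized)]:
        # Best of 3 repeats, 10 calls each, reported per call.
        best = min(timeit.repeat(fn, number=10, repeat=3)) / 10
        print('%s: %.2f ms per call' % (name, best * 1e3))
```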
Don't worry about libraries like Cython or Numba; they won't speed up this already performant operation.