How do I parallelize a simple Python loop?
Question
This is probably a trivial question, but how do I parallelize the following loop in python?
# setup output lists
output1 = list()
output2 = list()
output3 = list()

for j in range(0, 10):
    # calc individual parameter value
    parameter = j * offset
    # call the calculation
    out1, out2, out3 = calc_stuff(parameter=parameter)

    # put results into correct output list
    output1.append(out1)
    output2.append(out2)
    output3.append(out3)
I know how to start single threads in Python but I don't know how to "collect" the results.
Multiple processes would be fine too - whatever is easiest for this case. I'm using currently Linux but the code should run on Windows and Mac as-well.
What's the easiest way to parallelize this code?
Recommended answer
Using multiple threads on CPython won't give you better performance for pure-Python code due to the global interpreter lock (GIL). I suggest using the multiprocessing module instead:
import multiprocessing

pool = multiprocessing.Pool(4)  # four worker processes
out1, out2, out3 = zip(*pool.map(calc_stuff, range(0, 10 * offset, offset)))
Note that this won't work in the interactive interpreter.
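For a version that runs as a script on Linux, Windows and Mac, something like the following sketch may help. It rests on a few assumptions: calc_stuff here is a hypothetical stand-in that returns three values, run_parallel and its steps and workers parameters are names invented for illustration, and offset=2 is an arbitrary example value. The parameters are built as j * offset, exactly as in the original loop, so offset does not have to be a non-zero integer as range() would require. Note also that pool.map passes each parameter positionally, not as parameter=...

```python
import multiprocessing


def calc_stuff(parameter):
    # Hypothetical stand-in for the real calculation; returns three values.
    return parameter, parameter ** 2, parameter ** 3


def run_parallel(offset, steps=10, workers=4):
    # Same parameters as the original loop: j * offset for j in 0..steps-1.
    parameters = [j * offset for j in range(steps)]
    with multiprocessing.Pool(workers) as pool:
        # map() preserves input order, so results line up with parameters.
        results = pool.map(calc_stuff, parameters)
    # Transpose the list of 3-tuples into three output lists.
    output1, output2, output3 = (list(column) for column in zip(*results))
    return output1, output2, output3


if __name__ == "__main__":
    # The guard keeps child processes from re-running this block when they
    # import the main module, which happens with the "spawn" start method
    # (the default on Windows and macOS).
    out1, out2, out3 = run_parallel(offset=2)
    print(out1)
    print(out2)
    print(out3)
```

The function handed to pool.map has to be defined at module level so the worker processes can import it, which is also why defining it in the interactive interpreter tends not to work.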
To avoid the usual FUD around the GIL: There wouldn't be any advantage to using threads for this example anyway. You want to use processes here, not threads, because they avoid a whole bunch of problems.