Optimizing multiprocessing.Pool with expensive initialization
Question
Here is a complete, simple working example:
import multiprocessing as mp
import time
import random

class Foo:
    def __init__(self):
        # some expensive set up function in the real code
        self.x = 2
        print('initializing')

    def run(self, y):
        time.sleep(random.random() / 10.)
        return self.x + y

def f(y):
    foo = Foo()
    return foo.run(y)

def main():
    pool = mp.Pool(4)
    for result in pool.map(f, range(10)):
        print(result)
    pool.close()
    pool.join()

if __name__ == '__main__':
    main()
How can I modify it so that Foo is initialized only once by each worker, not once per task? Basically, I want __init__ called 4 times, not 10. I am using Python 3.5.
Recommended answer
The intended way to deal with things like this is via the optional initializer and initargs arguments to the Pool() constructor. They exist precisely to give you a way to run setup code exactly once when each worker process is created. So, for example, add:
def init():
    global foo
    foo = Foo()
and change the Pool creation to:
pool = mp.Pool(4, initializer=init)
If you need to pass arguments to your per-process initialization function, also add an appropriate initargs=... argument.
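As a rough sketch of the initargs case: suppose, purely for illustration, that Foo's setup took a parameter x (the original Foo takes none, so that signature is an assumption). The values in initargs are unpacked as positional arguments to the initializer in each worker:

```python
import multiprocessing as mp


class Foo:
    def __init__(self, x):
        # Hypothetical variant: the expensive setup depends on an argument.
        self.x = x

    def run(self, y):
        return self.x + y


def init(x):
    # Called once in each worker process with the values from initargs.
    global foo
    foo = Foo(x)


def f(y):
    # Uses the per-process global created by init().
    return foo.run(y)


def main():
    # initargs is unpacked as positional arguments, so pass a tuple.
    pool = mp.Pool(4, initializer=init, initargs=(2,))
    results = pool.map(f, range(10))
    pool.close()
    pool.join()
    return results


if __name__ == '__main__':
    print(main())
```

Note the trailing comma in (2,): without it, initargs would be a bare integer rather than a one-element tuple.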
Note: of course you should also remove the foo = Foo() line from f(), so that your function uses the global foo created by init().
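Putting the pieces together, the question's example with these changes applied might look like the following. The os.getpid() call in the print is an addition for illustration only, to make it visible which process ran the setup; with a pool of 4 workers the 'initializing' message should appear 4 times rather than 10.

```python
import multiprocessing as mp
import os
import random
import time


class Foo:
    def __init__(self):
        # Stand-in for the expensive setup in the real code.
        self.x = 2
        # The pid is printed only to show which process ran the setup.
        print('initializing in process', os.getpid())

    def run(self, y):
        time.sleep(random.random() / 10.)
        return self.x + y


def init():
    # Runs once in each worker process when the pool starts it.
    global foo
    foo = Foo()


def f(y):
    # No Foo() here: reuse the per-process global created by init().
    return foo.run(y)


def main():
    pool = mp.Pool(4, initializer=init)
    results = pool.map(f, range(10))
    for result in results:
        print(result)
    pool.close()
    pool.join()
    return results


if __name__ == '__main__':
    main()
```

Each worker gets its own independent foo, so any state that run() mutates is not shared between processes.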