Optimizing multiprocessing.Pool with expensive initialization


Question

Here is a complete, simple working example:

import multiprocessing as mp
import time
import random


class Foo:
    def __init__(self):
        # some expensive set up function in the real code
        self.x = 2
        print('initializing')

    def run(self, y):
        time.sleep(random.random() / 10.)
        return self.x + y


def f(y):
    foo = Foo()
    return foo.run(y)


def main():
    pool = mp.Pool(4)
    for result in pool.map(f, range(10)):
        print(result)
    pool.close()
    pool.join()


if __name__ == '__main__':
    main()

How can I modify it so that Foo is initialized only once per worker, rather than once per task? Basically, I want __init__ called 4 times, not 10. I am using Python 3.5.

Recommended answer

The intended way to deal with this is via the optional initializer and initargs arguments to the Pool() constructor. They exist precisely to let you run setup code exactly once in each worker process, when that process is created. So, for example, add:

def init():
    global foo
    foo = Foo()

and change the Pool creation to:

pool = mp.Pool(4, initializer=init)

If you need to pass arguments to your per-process initialization function, also add an appropriate initargs=... argument.
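As a rough sketch of the initargs variant: the version below assumes Foo takes a constructor parameter x, which is hypothetical and not part of the original question. Each worker calls init(2) once, and pool.map returns the results in input order.

import multiprocessing as mp


class Foo:
    def __init__(self, x):
        # Stands in for the expensive setup; x is a hypothetical parameter.
        self.x = x

    def run(self, y):
        return self.x + y


def init(x):
    # Runs once in each worker process; initargs supplies x.
    global foo
    foo = Foo(x)


def f(y):
    # Uses the per-process global created by init().
    return foo.run(y)


def main():
    # initargs is passed as a tuple, hence the trailing comma.
    with mp.Pool(4, initializer=init, initargs=(2,)) as pool:
        return pool.map(f, range(10))


if __name__ == '__main__':
    print(main())

The global foo is separate in every worker process, so nothing is shared or copied between workers after start-up.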

Note: of course, you should also remove the

foo = Foo()

line from f(), so that your function uses the global foo created by init().
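Putting the pieces together, the example from the question might end up looking like the sketch below. Only init(), the initializer argument, and the removal of foo = Foo() from f() differ from the original. With four workers that stay alive for the whole run, 'initializing' should be printed four times instead of ten.

import multiprocessing as mp
import time
import random


class Foo:
    def __init__(self):
        # some expensive set up function in the real code
        self.x = 2
        print('initializing')

    def run(self, y):
        time.sleep(random.random() / 10.)
        return self.x + y


def init():
    # Called once in each worker process when the pool starts it.
    global foo
    foo = Foo()


def f(y):
    # No Foo() here any more; reuse the worker's global instance.
    return foo.run(y)


def main():
    pool = mp.Pool(4, initializer=init)
    for result in pool.map(f, range(10)):
        print(result)
    pool.close()
    pool.join()


if __name__ == '__main__':
    main()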
