How does multiprocessing code pickle the map function?

Question

I am writing a grid searching utility and am trying to use multiprocessing to speed up calculation. I have an objective function which interacts with a large class which I cannot pickle due to memory constraints (I can only pickle relevant attributes of the class).

import pickle
from multiprocessing import Pool


class TestClass:
    def __init__(self):
        self.param = 10

    def __getstate__(self):
        raise RuntimeError("don't you dare pickle me!")

    def __setstate__(self, state):
        raise RuntimeError("don't you dare pickle me!")

    def loss(self, ext_param):
        return self.param*ext_param


if __name__ == '__main__':
    test_instance = TestClass()

    def objective_function(param):
        return test_instance.loss(param)

    with Pool(4) as p:
        result = p.map(objective_function, range(20))
    print(result)

In the toy example above, I expected that pickling objective_function would also require pickling test_instance, and would therefore raise RuntimeError (because __getstate__ throws). However, this does not happen and the code runs smoothly.

So my question is - what is being pickled here exactly? And if test_instance is not pickled, then how is it reconstructed on individual processes?

Recommended answer

Ok, with Wilson's help and some further digging, I've managed to answer my own question. I'll insert the modified code from above to help with explanation:

import pickle
from multiprocessing import Pool, current_process


class TestClass:
    def __init__(self):
        self.param = 0

    def __getstate__(self):
        raise RuntimeError("don't you dare pickle me!")

    def __setstate__(self, state):
        raise RuntimeError("don't you dare pickle me!")

    def loss(self, ext_param):
        self.param += 1
        print(f"{current_process().pid}: {hex(id(self))}:  {self.param}: {ext_param} ")
        return f"{self.param}_{ext_param}"


def objective_function(param):
    return test_instance.loss(param)

if __name__ == '__main__':

    test_instance = TestClass()
    print(hex(id(test_instance)))
    print('objective_function' in globals())  # this returns True on my MacOS+python3.7

    with Pool(2) as p:
        result = p.map(objective_function, range(6))

    print(result)
    print(test_instance.param)

# ---- RUN RESULTS BELOW ----
# 0x7f987b955e48
# True
# 10484: 0x7f987b955e48:  1: 0 
# 10485: 0x7f987b955e48:  1: 1 
# 10484: 0x7f987b955e48:  2: 2 
# 10485: 0x7f987b955e48:  2: 3 
# 10484: 0x7f987b955e48:  3: 4 
# 10485: 0x7f987b955e48:  3: 5 
# ['1_0', '1_1', '2_2', '2_3', '3_4', '3_5']
# 0
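
The identical addresses and the final 0 in the run above are what you would expect if each worker holds its own copy of test_instance made by fork. The following is a minimal sketch of that behaviour outside of Pool, assuming a Unix system where the "fork" start method is available; the names counter, bump and compare_parent_and_child are made up for this illustration.

```python
import multiprocessing as mp

# Module-level state that exists before the fork (stands in for test_instance).
counter = {"value": 0}


def bump(queue):
    # Runs in the child: mutates the child's private copy only.
    counter["value"] += 1
    queue.put((hex(id(counter)), counter["value"]))


def compare_parent_and_child():
    ctx = mp.get_context("fork")  # assumption: Unix only
    queue = ctx.Queue()
    child = ctx.Process(target=bump, args=(queue,))
    child.start()
    child_addr, child_value = queue.get()
    child.join()
    # Same virtual address in both processes, but the parent's value is untouched.
    return child_addr == hex(id(counter)), child_value, counter["value"]


if __name__ == "__main__":
    print(compare_parent_and_child())
```

The child reports the same address as the parent because fork duplicates the address space, yet its increment never reaches the parent's copy.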

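As Wilson correctly hinted, the only things pickled by value during p.map are the parameters themselves. The objective function is pickled only as a reference to its name, so test_instance is never serialised. Instead, the function and test_instance are not reinitialised but copied into each worker by the os.fork() call that happens somewhere in the Pool initialisation. That is why the test_instance.param values inside each process are independent of each other and of the parent (which still prints 0), even though every process reports the same virtual address as the original instance from before the fork (an example of different processes sharing the same virtual address can be seen here). Note that this relies on the fork start method: with spawn (the default on Windows, and on macOS since Python 3.8) the main module is re-imported in each worker, so a test_instance created under the __main__ guard would not exist there.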
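
To check what pickle writes for a module-level function, you can serialise it directly. This is a small sketch reusing the class from the question (without __setstate__); payload is just an illustrative variable name. It assumes the code is run as a script so that the function can be looked up by name in its module.

```python
import pickle


class TestClass:
    def __init__(self):
        self.param = 10

    def __getstate__(self):
        raise RuntimeError("don't you dare pickle me!")

    def loss(self, ext_param):
        return self.param * ext_param


test_instance = TestClass()


def objective_function(param):
    return test_instance.loss(param)


# Pickling the function succeeds: only its module and name are written,
# not its code and not the globals it refers to.
payload = pickle.dumps(objective_function)
print(b"objective_function" in payload)

# Pickling the instance itself does go through __getstate__ and fails.
try:
    pickle.dumps(test_instance)
except RuntimeError as exc:
    print(f"instance pickling failed: {exc}")
```

Because the payload carries only a name, the worker must already have a function of that name available, which is what fork (or a re-import under spawn) provides.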

As for a solution to the original problem, I believe the only way to solve it properly is to place the necessary parameters in shared memory or behind a memory manager.
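
One possible shape for the manager-based approach is sketched below. It is not taken from the original code: loss, run_grid and the "param" key are hypothetical names, and it assumes that the relevant attributes of the large class are themselves small and picklable. Only a lightweight proxy is sent to the workers, while the values stay in the manager process.

```python
from functools import partial
from multiprocessing import Manager, Pool


def loss(shared_params, ext_param):
    # shared_params is a dict proxy in the workers; each lookup is
    # forwarded to the manager process that owns the real dict.
    return shared_params["param"] * ext_param


def run_grid(values, processes=2):
    with Manager() as manager:
        # Hypothetical: copy only the attributes the objective needs.
        shared_params = manager.dict(param=10)
        with Pool(processes) as pool:
            return pool.map(partial(loss, shared_params), values)


if __name__ == "__main__":
    print(run_grid(range(5)))
```

Every proxy access is a round trip to the manager process, so for large numeric data multiprocessing.shared_memory or multiprocessing.Array may be a better fit than a managed dict.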
