Replace pickle in Python multiprocessing lib
I need to execute the code below (simplified version of my real code base in Python 3.5):
import multiprocessing

def forever(do_something=None):
    while True:
        do_something()

p = multiprocessing.Process(target=forever, args=(lambda: print("do something"),))
p.start()
In order to create the new process, Python needs to pickle the target function and the lambda passed as its argument. Unfortunately, pickle cannot serialize lambdas, and the output looks like this:
_pickle.PicklingError: Can't pickle <function <lambda> at 0x00C0D4B0>: attribute lookup <lambda> on __main__ failed
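The failure can be reproduced without starting a process at all. The following is a minimal sketch, not part of the original code base: `can_pickle` is a hypothetical helper introduced here for illustration, and `len` simply stands in for any function that can be found by name in an importable module. It assumes only the standard library.

```python
import pickle


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj.

    Hypothetical helper for illustration only.
    """
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        # Depending on the Python version and on where the function is
        # defined, the failure surfaces as one of these two exceptions.
        return False


# pickle stores functions by reference (module name plus qualified name),
# not by value, so the function must be retrievable by that name on load.
named_ok = can_pickle(len)

# A lambda's name is "<lambda>", which cannot be looked up in its module.
lambda_ok = can_pickle(lambda: print("do something"))

print(named_ok, lambda_ok)
```

Because the lambda has no importable name, pickle has nothing to reference, which is the attribute lookup failure reported in the traceback above.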
I discovered cloudpickle, which can serialize and deserialize lambdas and closures using the same interface as pickle.
How can I force the Python multiprocessing module to use cloudpickle instead of pickle?
Clearly hacking the code of the standard lib multiprocessing is not an option!
Thanks
Charlie
Try multiprocess. It's a fork of multiprocessing that uses the dill serializer instead of pickle; there are no other changes in the fork.
I'm the author. I encountered the same problem as you several years ago, and ultimately I decided that hacking the standard library was my only choice, as some of the pickle code in multiprocessing is in C++.
>>> import multiprocess as mp
>>> p = mp.Pool()
>>> p.map(lambda x:x**2, range(4))
[0, 1, 4, 9]
>>>