Python: (Pathos) Multiprocessing vs. class methods
Question
I am trying to parallelize a code using class methods via multiprocessing. The basic structure is the following:
# from multiprocessing import Pool
from pathos.multiprocessing import ProcessingPool as Pool

class myclass(object):
    def __init__(self):
        pass  # some code
    def mymethod(self):
        # more code
        return another_instance_of_myclass

def myfunc(myinstance, args):
    # some code
    test = myinstance.mymethod()
    # more code
    return myresult  # not an instance, just a number

p = Pool()
result = p.map(myfunc, listwithdata)
After this failed with the standard multiprocessing module, I became aware of the issues with pickle and multiprocessing, so I tried to solve it with pathos.multiprocessing. However, I am still getting
PicklingError: Can't pickle <type 'SwigPyObject'>: it's not found as __builtin__.SwigPyObject
together with lots of errors from pickle.py. Apart from this practical problem, I don't quite understand why anything but the final result of myfunc is being pickled at all.
Recommended answer
pathos uses dill, and dill serializes classes differently than Python's pickle module does. pickle serializes classes by reference. dill (by default) serializes classes directly, and only optionally by reference.
>>> import dill
>>>
>>> class Foo(object):
...   def __init__(self, x):
...     self.x = x
...   def bar(self, y):
...     return self.x + y * z
...   z = 1
...
>>> f = Foo(2)
>>>
>>> dill.dumps(f) # the dill default, explicitly serialize a class
'\x80\x02cdill.dill\n_create_type\nq\x00(cdill.dill\n_load_type\nq\x01U\x08TypeTypeq\x02\x85q\x03Rq\x04U\x03Fooq\x05h\x01U\nObjectTypeq\x06\x85q\x07Rq\x08\x85q\t}q\n(U\r__slotnames__q\x0b]q\x0cU\n__module__q\rU\x08__main__q\x0eU\x03barq\x0fcdill.dill\n_create_function\nq\x10(cdill.dill\n_unmarshal\nq\x11Uyc\x02\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00C\x00\x00\x00s\x0f\x00\x00\x00|\x00\x00j\x00\x00|\x01\x00t\x01\x00\x14\x17S(\x01\x00\x00\x00N(\x02\x00\x00\x00t\x01\x00\x00\x00xt\x01\x00\x00\x00z(\x02\x00\x00\x00t\x04\x00\x00\x00selft\x01\x00\x00\x00y(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x03\x00\x00\x00bar\x04\x00\x00\x00s\x02\x00\x00\x00\x00\x01q\x12\x85q\x13Rq\x14c__builtin__\n__main__\nh\x0fNN}q\x15tq\x16Rq\x17U\x01zq\x18K\x01U\x07__doc__q\x19NU\x08__init__q\x1ah\x10(h\x11Uuc\x02\x00\x00\x00\x02\x00\x00\x00\x02\x00\x00\x00C\x00\x00\x00s\r\x00\x00\x00|\x01\x00|\x00\x00_\x00\x00d\x00\x00S(\x01\x00\x00\x00N(\x01\x00\x00\x00t\x01\x00\x00\x00x(\x02\x00\x00\x00t\x04\x00\x00\x00selfR\x00\x00\x00\x00(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x08\x00\x00\x00__init__\x02\x00\x00\x00s\x02\x00\x00\x00\x00\x01q\x1b\x85q\x1cRq\x1dc__builtin__\n__main__\nh\x1aNN}q\x1etq\x1fRq utq!Rq")\x81q#}q$U\x01xq%K\x02sb.'
>>> dill.dumps(f, byref=True) # the pickle default, serialize by reference
'\x80\x02c__main__\nFoo\nq\x00)\x81q\x01}q\x02U\x01xq\x03K\x02sb.'
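To make "by reference" concrete, here is a minimal Python 3 sketch using only the standard pickle module (the session above is Python 2 with dill). The class Foo here is just an illustration. It checks that the pickled bytes name the class but carry none of its method code, which is why the receiving process must be able to import the class itself.

```python
import pickle

class Foo:
    def __init__(self, x):
        self.x = x

    def bar(self, y):
        return self.x + y

# pickle stores the module and class name plus the instance state,
# but not the class body or its methods.
payload = pickle.dumps(Foo(2))

has_class_name = b"Foo" in payload   # the class is referenced by name
has_method_name = b"bar" in payload  # the method is not part of the payload

print(has_class_name, has_method_name)
```

With dill's default setting, the class definition itself travels in the payload, which explains the much longer output in the first dumps call above.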
Not serializing by reference is much more flexible. However, in rare circumstances, working with references is better (as appears to be the case when pickling something built on a SwigPyObject).
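As for why anything other than the final result gets pickled: each argument passed to map has to be serialized to reach a worker process, so the instance itself is pickled, including whatever it holds. The sketch below is an illustration only. It uses the standard pickle module rather than dill, and a threading.Lock as a hypothetical stand-in for the SWIG object, since both refuse to be pickled. The names Wrapper and can_pickle are made up for this example.

```python
import pickle
import threading

class Wrapper:
    """Hypothetical stand-in for a class that wraps a SWIG object."""
    def __init__(self, value):
        self.value = value
        self.handle = threading.Lock()  # unpicklable, like a SwigPyObject

def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False

w = Wrapper(3)

# Check each instance attribute to see which one blocks serialization.
blockers = [name for name, attr in vars(w).items() if not can_pickle(attr)]

print(can_pickle(w), blockers)
```

A check like this can help locate which attribute of your own instance is the one carrying the SwigPyObject.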
I have been meaning (for about two years) to expose the byref flag to the dump call inside pathos, but have not done so yet. It should be a fairly simple edit. I've just added a ticket for it: https://github.com/uqfoundation/pathos/issues/58. While I'm at it, it should also be easy to allow replacing the dump and load functions that pathos uses; that way you could use customized serializers (i.e. extend those that dill provides, or use some other serializer).
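To give a rough idea of what customizing serialization can look like, here is a sketch using the standard library's copyreg module, which lets you register a reduction function for a type you cannot modify. This is not a pathos hook, and whether pathos or dill picks up this registry is an assumption not verified here. The Handle class and reduce_handle function are hypothetical stand-ins.

```python
import copyreg
import pickle
import threading

class Handle:
    """Hypothetical stand-in for a third-party type that cannot be pickled."""
    def __init__(self, name):
        self.name = name
        self._lock = threading.Lock()  # makes default pickling fail

def reduce_handle(handle):
    # Tell pickle to rebuild the object by calling Handle(name) on load,
    # instead of copying its unpicklable internal state.
    return (Handle, (handle.name,))

# Register the reducer for this type in the global copyreg dispatch table.
copyreg.pickle(Handle, reduce_handle)

original = Handle("sensor-1")
restored = pickle.loads(pickle.dumps(original))

print(restored.name)
```

The restored object is a fresh Handle built from the saved name, so this approach only fits objects that can be reconstructed from a small amount of plain data.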