Python: (Pathos) Multiprocessing vs. class methods


Problem description

I am trying to parallelize a code using class methods via multiprocessing. The basic structure is the following:

# from multiprocessing import Pool
from pathos.multiprocessing import ProcessingPool as Pool

class myclass(object):
    def __init__(self):
        pass  # some code
    def mymethod(self):
        # more code
        return myclass()  # another instance of myclass



def myfunc(myinstance, args):
    # some code
    test = myinstance.mymethod()
    # more code
    return myresult  # not an instance, just a number

p=Pool()

result = p.map(myfunc,listwithdata)

After this failed with the standard multiprocessing module, I became aware of the issues with pickle and multiprocessing, so I tried to solve it with pathos.multiprocessing. However, I am still getting

PicklingError: Can't pickle <type 'SwigPyObject'>: it's not found as __builtin__.SwigPyObject

together with lots of errors from pickle.py. Apart from this practical problem, I don't quite understand why anything but the final result of myfunc is being pickled at all.

Recommended answer

pathos uses dill, and dill serializes classes differently than Python's pickle module does. pickle serializes classes by reference: it stores only the module and class name, so the class must be importable when the data is loaded. dill (by default) serializes the class definition itself, and only optionally by reference.

>>> import dill
>>> 
>>> class Foo(object):
...   def __init__(self, x):
...     self.x = x
...   def bar(self, y):
...     return self.x + y * z
...   z = 1
... 
>>> f = Foo(2)
>>> 
>>> dill.dumps(f)  # the dill default, explicitly serialize a class
'\x80\x02cdill.dill\n_create_type\nq\x00(cdill.dill\n_load_type\nq\x01U\x08TypeTypeq\x02\x85q\x03Rq\x04U\x03Fooq\x05h\x01U\nObjectTypeq\x06\x85q\x07Rq\x08\x85q\t}q\n(U\r__slotnames__q\x0b]q\x0cU\n__module__q\rU\x08__main__q\x0eU\x03barq\x0fcdill.dill\n_create_function\nq\x10(cdill.dill\n_unmarshal\nq\x11Uyc\x02\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00C\x00\x00\x00s\x0f\x00\x00\x00|\x00\x00j\x00\x00|\x01\x00t\x01\x00\x14\x17S(\x01\x00\x00\x00N(\x02\x00\x00\x00t\x01\x00\x00\x00xt\x01\x00\x00\x00z(\x02\x00\x00\x00t\x04\x00\x00\x00selft\x01\x00\x00\x00y(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x03\x00\x00\x00bar\x04\x00\x00\x00s\x02\x00\x00\x00\x00\x01q\x12\x85q\x13Rq\x14c__builtin__\n__main__\nh\x0fNN}q\x15tq\x16Rq\x17U\x01zq\x18K\x01U\x07__doc__q\x19NU\x08__init__q\x1ah\x10(h\x11Uuc\x02\x00\x00\x00\x02\x00\x00\x00\x02\x00\x00\x00C\x00\x00\x00s\r\x00\x00\x00|\x01\x00|\x00\x00_\x00\x00d\x00\x00S(\x01\x00\x00\x00N(\x01\x00\x00\x00t\x01\x00\x00\x00x(\x02\x00\x00\x00t\x04\x00\x00\x00selfR\x00\x00\x00\x00(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x08\x00\x00\x00__init__\x02\x00\x00\x00s\x02\x00\x00\x00\x00\x01q\x1b\x85q\x1cRq\x1dc__builtin__\n__main__\nh\x1aNN}q\x1etq\x1fRq utq!Rq")\x81q#}q$U\x01xq%K\x02sb.'
>>> dill.dumps(f, byref=True)  # the pickle default, serialize by reference
'\x80\x02c__main__\nFoo\nq\x00)\x81q\x01}q\x02U\x01xq\x03K\x02sb.'

Not serializing by reference is much more flexible. However, in rare circumstances, working with references is better (as appears to be the case when pickling something built on a SwigPyObject).
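
On the question of why anything besides the result is pickled: a process pool has to pickle the function and each argument to send them to a worker process, so an argument that holds an unpicklable handle fails before myfunc ever runs. A different workaround from the byref flag discussed here, and not something the original answer proposes, is to keep the handle out of the pickled state. The sketch below uses the standard pickle hooks `__getstate__` and `__setstate__`. The `Wrapper` class is hypothetical, and a `threading.Lock` stands in for the SWIG handle because the standard pickle module cannot serialize it either. Whether a real SWIG object can be rebuilt from plain data like this is an assumption that depends on the wrapped library.

```python
import pickle
import threading


class Wrapper(object):
    """Hypothetical stand-in for a class that wraps a SWIG object."""

    def __init__(self, value):
        self.value = value
        # A lock cannot be pickled by the standard pickle module,
        # much like a SwigPyObject handle.
        self._handle = threading.Lock()

    def __getstate__(self):
        # Copy the instance state and leave out the unpicklable handle.
        state = self.__dict__.copy()
        del state["_handle"]
        return state

    def __setstate__(self, state):
        # Restore the plain attributes and rebuild the handle locally.
        self.__dict__.update(state)
        self._handle = threading.Lock()


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError):
        return False


w = Wrapper(3)
clone = pickle.loads(pickle.dumps(w))
```

Here `can_pickle(threading.Lock())` is false while `can_pickle(w)` is true, and `clone` carries the same `value` with a freshly created handle. dill builds on pickle, so the same hooks are expected to apply under pathos, though that is not verified here.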

I have been meaning (for ~2 years) to expose the byref flag to the dump call inside of pathos, but have not done so yet. It should be a fairly simple edit to do so. I've just added a ticket to do so: https://github.com/uqfoundation/pathos/issues/58. While I'm at it, it should also be easy to open up replacement of the dump and load functions that pathos uses… that way you could use customized serializers (i.e. extend those that dill provides, or use some other serializer).
