Cannot pass argument to method using multiprocessing.Pool
Problem description
My program takes several arguments, one of which is called challenges and receives an integer value from the command line. I want to use multiprocessing by passing the value of challenges to a self-defined method generation:
import multiprocessing
gen = generator.GenParentClass()
mlp = multiprocessing.Pool(processes=multiprocessing.cpu_count())
X, y = mlp.imap_unordered(gen.generation, [args.challenges])
The method generation in class GenParentClass has this simple signature:
def generation(self, num):
    # some stuff
However, I get this error:
Traceback (most recent call last):
  File "experiments.py", line 194, in <module>
    X, y = mlp.imap_unordered(gen.generation, [args.challenges])
  File "/anaconda/lib/python2.7/multiprocessing/pool.py", line 668, in next
    raise value
cPickle.PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed
I don't know how to solve this problem. Everything seems correct to me! Any help is appreciated.
Recommended answer
The multiprocessing module serializes (pickles) the function and the arguments passed to imap_unordered so that it can send them to the worker processes. Here gen.generation is an instance method (a method bound to an object of a class), and Python 2.7 cannot pickle instance methods, hence your error.
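To see the pickling behaviour in isolation, without a pool, here is a minimal sketch. GenParentClass is a hypothetical stand-in for the asker's class, and top_level_generation and can_pickle are helper names invented for this illustration. On Python 2.7 the bound method fails to pickle, as in the traceback above; recent Python 3 versions can generally pickle bound methods of picklable instances, so the result of the second check depends on the interpreter.

```python
import pickle


class GenParentClass(object):
    """Hypothetical stand-in for the asker's class."""

    def generation(self, num):
        return num * 2


def top_level_generation(num):
    """A module-level function; pickle stores it by its qualified name."""
    return num * 2


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    gen = GenParentClass()
    print(can_pickle(top_level_generation))  # module-level function
    print(can_pickle(gen.generation))        # bound method: fails on Python 2.7
    print(can_pickle(lambda num: num * 2))   # lambda: cannot be looked up by name
```

Anything handed to a Pool goes through the same pickle step, so a quick check like this can help narrow down which object is causing a PicklingError.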
Here is a possible workaround that defines the function outside of the class and adds extra argument(s) to it, which are filled in using partial from functools:
import multiprocessing
from functools import partial


class GenParentClass(object):
    a = None
    b = None

    def __init__(self, a, b):
        self.a = a
        self.b = b


# define this outside of GenParentClass (i.e., top level function)
def generation(num, x, y):
    return x + y + num


# the guard keeps worker processes from re-running this block on import
if __name__ == "__main__":
    gen = GenParentClass(3, 5)
    mlp = multiprocessing.Pool(processes=multiprocessing.cpu_count())
    R = mlp.imap_unordered(partial(generation, x=gen.a, y=gen.b), [1, 2, 3])
    # imap_unordered yields results as they finish, so sort for a stable order
    print(sorted(R))  # prints "[9, 10, 11]"
    mlp.close()
    mlp.join()
More information on pickle-ability is available here.
More information on functools is available here.
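If the method has to stay inside the class, a related option is to keep only a small helper at module level and send the instance itself to the workers. The following is a sketch under assumptions, not the answer's own code: the body of generation, the helper call_generation, and the wrapper run_pool are all hypothetical, and the instance must itself be picklable.

```python
import multiprocessing


class GenParentClass(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b

    # Hypothetical body; the question only shows the signature.
    def generation(self, num):
        return self.a + self.b + num


def call_generation(task):
    """Top-level helper: unpack (instance, num) and call the method in the worker."""
    gen, num = task
    return gen.generation(num)


def run_pool(gen, nums):
    """Map gen.generation over nums using a small process pool."""
    pool = multiprocessing.Pool(processes=2)
    try:
        # Each task carries the instance, which pickle can handle,
        # instead of the bound method, which Python 2.7 cannot pickle.
        tasks = [(gen, n) for n in nums]
        results = list(pool.imap_unordered(call_generation, tasks))
    finally:
        pool.close()
        pool.join()
    return sorted(results)  # imap_unordered yields results in arbitrary order


if __name__ == "__main__":
    print(run_pool(GenParentClass(3, 5), [1, 2, 3]))
```

Each worker receives its own copy of the instance, so any attribute changes made inside generation are not visible in the parent process.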
Edit 2: If you use multiprocessing.Process and the method is written in terms of the qualified attribute names self.a and self.b, you can run it without moving it outside the class. However, you won't be able to retrieve its return value, and the state of gen2 will not change in the parent process, because each child process works on its own copy of the object (which largely defeats the purpose of calling the method).
gen2 = GenParentClass(4, 6)
p = {}
for key in range(5):
    p[key] = multiprocessing.Process(target=GenParentClass.altgen, args=(gen2, key,))
    p[key].start()
for key in p:
    p[key].join()
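If the output is needed after all, one common option is to pass a multiprocessing.Queue to each process and have the method put its result there. The sketch below assumes a hypothetical altgen body (the answer does not show one) with an extra results parameter, plus a helper run_altgen invented for this illustration. It also shows that gen2 itself is left unchanged in the parent.

```python
import multiprocessing


class GenParentClass(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b

    # Hypothetical method; the original answer does not show altgen.
    def altgen(self, num, results):
        self.a += num  # modifies only the child process's copy of the object
        results.put((num, self.a + self.b))


def run_altgen(gen, keys):
    """Run altgen once per key in separate processes and collect the results."""
    results = multiprocessing.Queue()
    procs = [
        multiprocessing.Process(target=GenParentClass.altgen, args=(gen, key, results))
        for key in keys
    ]
    for proc in procs:
        proc.start()
    # Read one result per process before joining, so no child blocks on the queue.
    collected = dict(results.get() for _ in procs)
    for proc in procs:
        proc.join()
    return collected


if __name__ == "__main__":
    gen2 = GenParentClass(4, 6)
    print(run_altgen(gen2, range(5)))  # maps each key to the value computed in its child
    print(gen2.a)                      # unchanged in the parent process
```

The queue carries values back to the parent, but attribute updates made in a child still stay in that child, so any state that must persist has to be returned explicitly.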