Python - Multiprocessing, member functions of classes
Question
I can't figure out whether this is my mistake or a problem with the multiprocessing module in Python 2.7. Can anyone figure out why this is not working?
import sys
from multiprocessing import Pool as mp

class encapsulation:
    def __init__(self):
        self.member_dict = {}

    def update_dict(self, index, value):
        self.member_dict[index] = value

encaps = encapsulation()

def method(argument):
    encaps.update_dict(argument, argument)
    print encaps.member_dict

p = mp()  # sets up a multiprocessing pool of worker processes
p.map(method, sys.argv[1:])  # method is the function, sys.argv[1:] is the list of arguments to distribute
print encaps.member_dict
>>>{argument:argument}
>>>{}
So my question is just about member variables. My understanding is that the encapsulation class should hold this dictionary both inside and outside of the function. Why does it reset and give me an empty dictionary even though I have only initialized it once? Please help.
Recommended answer
Even though you are encapsulating the object, the multiprocessing module ends up using a local copy of the object in each worker process and never propagates your changes back to the parent. You are also not using Pool.map as intended: it expects each call to return a result, and it collects those results into the list it returns to you. If what you want is to modify a shared object, then you need a manager, which coordinates access to the shared state:
from multiprocessing import Pool
from multiprocessing import Manager
import sys

class encapsulation:
    def __init__(self):
        self.member_dict = {}

    def update_dict(self, index, value):
        self.member_dict[index] = value

encaps = encapsulation()

def method(argument):
    encaps.update_dict(argument, argument)
    # print encaps.member_dict

manager = Manager()
encaps.member_dict = manager.dict()
p = Pool()
p.map(method, sys.argv[1:])
print encaps.member_dict
Output
$ python mp.py a b c
{'a': 'a', 'c': 'c', 'b': 'b'}
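The difference between a plain dict and a manager dict can be seen side by side. The following is a minimal sketch, written for Python 3 rather than the Python 2.7 used above; the names `plain`, `write_plain`, `write_shared` and `compare` are made up for illustration, and the pool size of 2 is arbitrary.

```python
from multiprocessing import Manager, Pool

plain = {}  # ordinary module-level dict; every worker process gets its own copy


def write_plain(key):
    # Mutates only the worker's private copy of `plain`.
    plain[key] = key


def write_shared(args):
    # `shared` is a manager proxy; the write is forwarded to the manager process.
    shared, key = args
    shared[key] = key


def compare(keys):
    """Return (plain dict as seen by the parent, contents of the shared dict)."""
    with Manager() as manager:
        shared = manager.dict()
        with Pool(2) as pool:
            pool.map(write_plain, keys)
            pool.map(write_shared, [(shared, k) for k in keys])
        # Copy the proxy's contents out before the manager shuts down.
        return dict(plain), shared.copy()


if __name__ == "__main__":
    parent_view, shared_view = compare(["a", "b", "c"])
    print(parent_view)   # stays empty in the parent
    print(shared_view)   # holds every key written by the workers
```

The `if __name__ == "__main__":` guard is there because some platforms start workers by re-importing the main module.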
I would suggest not setting the shared object as the member attribute, but rather passing it in as an argument, or encapsulating the shared object itself and then copying its values into your dict. The shared object should not be kept around permanently, because the proxy is only usable while the manager process is alive. Once the work is done, copy its contents out and discard it:
# copy the values to a regular dict
encaps.member_dict = encaps.member_dict.copy()
But this might be better:
import sys
from multiprocessing import Manager, Pool

class encapsulation:
    def __init__(self):
        self.member_dict = {}

    # normal dict update
    def update_dict(self, d):
        self.member_dict.update(d)

encaps = encapsulation()
manager = Manager()
results_dict = manager.dict()

# workers write only to the shared dict, never to encaps
def method(argument):
    results_dict[argument] = argument

p = Pool()
p.map(method, sys.argv[1:])
encaps.update_dict(results_dict)
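The suggestion to pass the shared object in as an argument could look roughly like the sketch below. It assumes Python 3 and uses `functools.partial` to bind the manager dict to the worker function; the `collect` helper and the capitalised `Encapsulation` class name are illustrative, not part of the original answer.

```python
from functools import partial
from multiprocessing import Manager, Pool


class Encapsulation:
    def __init__(self):
        self.member_dict = {}

    def update_dict(self, d):
        # Ordinary dict update, performed in the parent process only.
        self.member_dict.update(d)


def method(results, argument):
    # `results` is the manager proxy, received explicitly as an argument.
    results[argument] = argument


def collect(arguments):
    encaps = Encapsulation()
    with Manager() as manager:
        results = manager.dict()
        with Pool(2) as pool:
            # partial() binds the proxy so map() only has to supply `argument`.
            pool.map(partial(method, results), arguments)
        # Copy into a regular dict while the manager is still running.
        encaps.update_dict(results.copy())
    return encaps.member_dict


if __name__ == "__main__":
    print(collect(["a", "b", "c"]))
```

Because the proxy travels with each task instead of living in a module-level variable, this version does not depend on workers inheriting the parent's globals.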
Using Pool.map as intended
If you were using map to return values, it might look like this:
def method(argument):
    encaps.update_dict(argument, argument)
    return encaps.member_dict

p = Pool()
results = p.map(method, sys.argv[1:])
print results
# e.g. [{'a': 'a'}, {'b': 'b'}, {'c': 'c'}]
# (a worker that handles several arguments returns everything it has accumulated so far)
You would then need to merge the results back into your dict:
for result in results:
    encaps.member_dict.update(result)
print encaps.member_dict
# {'a': 'a', 'c': 'c', 'b': 'b'}
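A leaner variant of the same idea is to have each call return just its own key/value pair and build the dict once in the parent. This is a sketch for Python 3 under that assumption; `build_dict` is a hypothetical helper name and the fallback argument list is only there so the script runs without command-line arguments.

```python
import sys
from multiprocessing import Pool


def method(argument):
    # Return the data instead of mutating state; Pool.map sends it back to the parent.
    return argument, argument


def build_dict(arguments):
    with Pool(2) as pool:
        pairs = pool.map(method, arguments)  # list of (key, value) tuples, in input order
    return dict(pairs)


if __name__ == "__main__":
    print(build_dict(sys.argv[1:] or ["a", "b", "c"]))
```

No manager is needed here, since nothing is shared: the workers only compute and return values, and the parent owns the dict.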