Sharing mutable global variable in Python multiprocessing.Pool
Question
I'm trying to update a shared object (a `dict`) using the following code. But it does not work: it gives me the input `dict` back as the output.
Edit: Essentially, what I'm trying to achieve here is to append items from the data (a list) to the dict's lists. Each data item selects a key in the dict, and its position (index) in the data is appended to that key's list.
Expected output: `{'2': [2], '1': [1, 4, 6], '3': [3, 5]}`

Note: Approach 2 raises the error `TypeError: 'int' object is not iterable`.
Approach 1
```python
from multiprocessing import *

def mapTo(d, tree):
    for idx, item in enumerate(list(d), start=1):
        tree[str(item)].append(idx)

data = [1, 2, 3, 1, 3, 1]
manager = Manager()
sharedtree = manager.dict({"1": [], "2": [], "3": []})

with Pool(processes=3) as pool:
    pool.starmap(mapTo, [(data, sharedtree) for _ in range(3)])
```
Approach 2

```python
from multiprocessing import *

def mapTo(d):
    global tree
    for idx, item in enumerate(list(d), start=1):
        tree[str(item)].append(idx)

def initializer():
    global tree
    tree = dict({"1": [], "2": [], "3": []})

data = [1, 2, 3, 1, 3, 1]

with Pool(processes=3, initializer=initializer, initargs=()) as pool:
    pool.map(mapTo, data)
```
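As a side note, the `TypeError` in Approach 2 comes from `pool.map` calling `mapTo` once per element of `data`, so `d` is a single int and `list(d)` fails. A standalone sketch of the same error:

```python
# pool.map(mapTo, data) passes each int in data to mapTo individually,
# so mapTo receives e.g. d = 1 and list(1) raises the TypeError.
try:
    list(1)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)  # 'int' object is not iterable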
Answer
You need to use managed lists if you want the changes to be reflected. So, the following works for me:
```python
from multiprocessing import Pool, Manager

def mapTo(d, tree):
    for idx, item in enumerate(list(d), start=1):
        tree[str(item)].append(idx)

if __name__ == '__main__':
    data = [1, 2, 3, 1, 3, 1]
    with Pool(processes=3) as pool:
        manager = Manager()
        sharedtree = manager.dict({"1": manager.list(), "2": manager.list(), "3": manager.list()})
        pool.starmap(mapTo, [(data, sharedtree) for _ in range(3)])
        print({k: list(v) for k, v in sharedtree.items()})
```
This is the output:

```
{'1': [1, 1, 1, 4, 4, 4, 6, 6, 6], '2': [2, 2, 2], '3': [3, 3, 5, 3, 5, 5]}
```
Note, you should always use the `if __name__ == '__main__':` guard when using multiprocessing; also, avoid starred imports like `from multiprocessing import *`.
If you are on Python < 3.6, you have to re-assign the list back into the managed dict for the change to propagate, so use this for `mapTo`:
```python
def mapTo(d, tree):
    for idx, item in enumerate(list(d), start=1):
        l = tree[str(item)]   # fetch the managed list
        l.append(idx)
        tree[str(item)] = l   # re-assign so the change propagates
```
And finally, you aren't using `starmap`/`map` correctly: you are passing the full data three times, so of course everything gets counted three times. A mapping operation should work on each individual element of the data you are mapping over, so you want something like:
```python
from functools import partial
from multiprocessing import Pool, Manager

def mapTo(i_d, tree):
    idx, item = i_d
    l = tree[str(item)]
    l.append(idx)
    tree[str(item)] = l

if __name__ == '__main__':
    data = [1, 2, 3, 1, 3, 1]
    with Pool(processes=3) as pool:
        manager = Manager()
        sharedtree = manager.dict({"1": manager.list(), "2": manager.list(), "3": manager.list()})
        pool.map(partial(mapTo, tree=sharedtree), list(enumerate(data, start=1)))
        print({k: list(v) for k, v in sharedtree.items()})
```