Multiprocessing of shared list

Question

I wrote a program like this:

from multiprocessing import Process, Manager

def worker(i):
    x[i].append(i)

if __name__ == '__main__':
    manager = Manager()
    x = manager.list()
    for i in range(5):
        x.append([])
    p = []
    for i in range(5):
        p.append(Process(target=worker, args=(i,)))
        p[i].start()

    for i in range(5):
        p[i].join()

    print(x)

I want to create a shared list of lists among processes, with each process modifying one of the lists in it. But the result of this program is a list of empty lists: [[],[],[],[],[]].

What's going wrong?

Recommended answer

I think this is because of a quirk in the way Managers are implemented.

If you create two Manager.list objects (l and z below) and then append one of the lists to the other, the type of the list that you append changes inside the parent list:

>>> type(l)
<class 'multiprocessing.managers.ListProxy'>
>>> type(z)
<class 'multiprocessing.managers.ListProxy'>
>>> l.append(z)
>>> type(l[0])
<class 'list'>   # Not a ListProxy anymore

l[0] and z are not the same object, and as a result they don't behave quite the way you'd expect:

>>> l[0].append("hi")
>>> print(z)
[]
>>> z.append("hi again")
>>> print(l[0])
['hi again']

As you can see, changing the nested list has no effect on the ListProxy object, but changing the ListProxy object does change the nested list. The documentation explicitly notes this:

Note

Modifications to mutable values or items in dict and list proxies will not be propagated through the manager, because the proxy has no way of knowing when its values or items are modified. To modify such an item, you can re-assign the modified object to the container proxy:
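The re-assignment that the note describes can be sketched roughly as follows. This is a minimal illustration, not the documentation's own example: the function name demo_reassign and the strings "lost" and "kept" are made up for the sketch, and it assumes the nested item is a plain list rather than another proxy.

```python
from multiprocessing import Manager


def demo_reassign():
    """Show that in-place edits to a nested plain list are lost, while re-assignment sticks."""
    with Manager() as manager:
        # One plain (non-proxy) list nested inside a ListProxy.
        shared = manager.list([[]])

        # shared[0] hands back a local copy, so this append only changes the copy.
        shared[0].append("lost")
        after_in_place = list(shared)

        # Read the item, modify the local copy, then assign it back through the proxy.
        item = shared[0]
        item.append("kept")
        shared[0] = item
        after_reassign = list(shared)

    return after_in_place, after_reassign


if __name__ == "__main__":
    print(demo_reassign())
```

The first value returned should still hold an empty nested list, while the second should contain the re-assigned item.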

Digging through the source code, you can see that when you call append on a ListProxy, the append call is actually sent to the manager process via IPC, and the manager then calls append on the shared list. That means the arguments to append need to be pickled and unpickled. When a ListProxy is unpickled inside its own manager's process, it is turned into a regular Python list: the object the ListProxy was pointing to (aka its referent). This is also noted in the documentation:

An important feature of proxy objects is that they are picklable so they can be passed between processes. Note, however, that if a proxy is sent to the corresponding manager's process then unpickling it will produce the referent itself. This means, for example, that one shared object can contain a second

So, going back to the example above, why does updating z also update l[0], but not the other way around? Inside the manager, the item stored in l is the same list that z refers to, so anything appended through z is visible the next time l[0] is read. Each read of l[0], however, pickles that list and hands the client a copy. The copy knows nothing about the proxy, so when you change the copy, nothing in the manager changes.

So, in order to make your example work, each worker needs to take its sublist out of the shared list, modify that local copy, and assign it back through the parent proxy, rather than mutating it in place via the index of the parent list:

#!/usr/bin/env python3

from multiprocessing import Process, Manager

def worker(x, i):
    sub_l = x[i]      # reading through the proxy returns a local copy
    sub_l.append(i)   # modify the copy
    x[i] = sub_l      # assign it back so the manager sees the change


if __name__ == '__main__':
    manager = Manager()
    x = manager.list([[] for _ in range(5)])
    p = []
    for i in range(5):
        p.append(Process(target=worker, args=(x, i)))
        p[i].start()

    for i in range(5):
        p[i].join()

    print(x)

Here's the output:

dan@dantop2:~$ ./multi_weirdness.py 
[[0], [1], [2], [3], [4]]
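On Python 3.6 and later, the multiprocessing documentation notes that shared objects can be nested, which allows a different approach: make each sublist a manager.list() of its own, so that x[i] comes back as a proxy rather than a copy. The following is a minimal sketch under that assumption; the names run_nested and shared are illustrative and not part of the original answer.

```python
from multiprocessing import Manager, Process


def worker(shared, i):
    # shared[i] is itself a ListProxy here, so the append is sent to the manager.
    shared[i].append(i)


def run_nested(n=5):
    """Start n workers that each append their index to their own nested shared list."""
    with Manager() as manager:
        # Nest manager.list() proxies instead of plain lists (assumes Python 3.6+).
        shared = manager.list([manager.list() for _ in range(n)])
        procs = [Process(target=worker, args=(shared, i)) for i in range(n)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        # Copy the contents into plain lists before the manager shuts down.
        return [list(sub) for sub in shared]


if __name__ == "__main__":
    print(run_nested())
```

With this layout the worker can use the x[i].append(i) form from the question unchanged. Printing the outer proxy directly would show the nested proxy objects, so the sketch converts each sublist to a plain list before returning.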
