python multiprocess update dictionary synchronously


Question


I am trying to update one common dictionary through multiple processes. Could you please help me find out what the problem with this code is? I get the following output:

inside function
{1: 1, 2: -1}
comes here
inside function
{1: 0, 2: 2}
comes here
{1: 0, 2: -1}

Thanks.

from multiprocessing import Lock, Process, Manager

l= Lock()


def computeCopyNum(test,val):
    l.acquire()
    test[val]=val
    print "inside function"
    print test
    l.release()
    return

a=dict({1: 0, 2: -1})

procs=list()

for i in range(1,3):
    p = Process(target=computeCopyNum, args=(a,i))
    procs.append(p)
    p.start()

for p in procs:
    p.join()
    print "comes here"

print a

Answer


The answer is actually quite simple. You're using the multiprocessing module, with which you start several different python processes. Different processes have different address spaces and they do not share memory, so all your processes write to their own local copy of the dictionary.


The easiest way to do inter-process communication when using the multiprocessing module is to use a queue to communicate between the slave processes and the master process.

from multiprocessing import Process, Queue

def computeCopyNum(queue, val):
    queue.put(val) # can also put a tuple of thread-id and value if we would like to

procs=list()

queue = Queue()
for i in range(1,3):
    p = Process(target=computeCopyNum, args=(queue, i))
    procs.append(p)
    p.start()

for _ in procs:
    val = queue.get()
    # do whatever with val

for p in procs:
    p.join()
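Applied to the question's original goal, the values pulled from the queue can be folded back into a dictionary on the master side, so only one process ever mutates it and no lock is required. A minimal runnable Python 3 sketch (sending key/value tuples is my adaptation, not part of the original answer):

```python
from multiprocessing import Process, Queue

def computeCopyNum(queue, val):
    # send a (key, value) pair back to the master instead of
    # mutating a shared dict inside the child process
    queue.put((val, val))

def main():
    queue = Queue()
    procs = [Process(target=computeCopyNum, args=(queue, i))
             for i in range(1, 3)]
    for p in procs:
        p.start()

    result = {1: 0, 2: -1}
    for _ in procs:
        key, value = queue.get()
        result[key] = value  # the master owns the dict; no lock needed

    for p in procs:
        p.join()
    return result

if __name__ == "__main__":
    print(main())  # {1: 1, 2: 2}
```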


If each slave-process can generate multiple output values it might be prudent to let each slave-process write a sentinel-value to the queue to signal to the master that it's done. Then the code might look something like:

def slave(queue):
    for i in range(128): # just for example
        val = i * i # placeholder for some calculated result
        queue.put(val)

    queue.put(None) # add a sentinel value to tell the master we're done

queue = Queue()

# spawn 32 slave processes
num_procs = 32
procs = [Process(target=slave, args=(queue, )) for _ in range(num_procs)]
for proc in procs: 
    proc.start()

finished = 0
while finished < num_procs:
    item = queue.get()
    if item is None: 
        finished += 1
    else:
        pass # do something with item

for proc in procs: 
    proc.join()


You can also use a Manager, as shown in another answer. The problem with that approach is that a lot of implicit memory copying between process address spaces might occur, and that can be hard to reason about. I always prefer using explicit queues.
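For reference, here is a hedged sketch of what that Manager-based variant might look like, mirroring the question's code in Python 3 syntax (this is an illustration, not the other answer verbatim). `Manager().dict()` returns a proxy object whose mutations are forwarded to a server process, so all workers see the same underlying dictionary:

```python
from multiprocessing import Lock, Process, Manager

def computeCopyNum(lock, shared, val):
    # the proxy forwards this assignment to the manager's server
    # process, so the update is visible to every worker
    with lock:
        shared[val] = val

def main():
    manager = Manager()
    shared = manager.dict({1: 0, 2: -1})
    lock = Lock()  # pass the lock explicitly; a module-level lock
                   # is not shared across spawned processes
    procs = [Process(target=computeCopyNum, args=(lock, shared, i))
             for i in range(1, 3)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return dict(shared)  # copy the proxy into a plain dict

if __name__ == "__main__":
    print(main())  # {1: 1, 2: 2}
```

Every access to the proxy involves IPC with the manager process, which is the hidden copying overhead the paragraph above warns about.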

