Python 3 Multiprocessing queue deadlock when calling join before the queue is empty

Problem description

I have a question about how the queue in Python 3's multiprocessing module behaves.

This is what they say in the programming guidelines:

Bear in mind that a process that has put items in a queue will wait before terminating until all the buffered items are fed by the “feeder” thread to the underlying pipe. (The child process can call the Queue.cancel_join_thread method of the queue to avoid this behaviour.)

This means that whenever you use a queue you need to make sure that all items which have been put on the queue will eventually be removed before the process is joined. Otherwise you cannot be sure that processes which have put items on the queue will terminate. Remember also that non-daemonic processes will be joined automatically.

An example which will deadlock is the following:

from multiprocessing import Process, Queue

def f(q):
    q.put('X' * 1000000)

if __name__ == '__main__':
    queue = Queue()
    p = Process(target=f, args=(queue,))
    p.start()
    p.join()                    # this deadlocks
    obj = queue.get()

A fix here would be to swap the last two lines (or simply remove the p.join() line).
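
As a minimal sketch of that fix, here is the same example with the last two lines swapped. The `run()` wrapper and the length check are assumptions added for illustration; they are not part of the documentation's example.

```python
from multiprocessing import Process, Queue

def f(q):
    # A payload far larger than a typical OS pipe buffer
    q.put('X' * 1000000)

def run():
    # Hypothetical wrapper (not in the original example) so the result can be checked
    queue = Queue()
    p = Process(target=f, args=(queue,))
    p.start()
    obj = queue.get()   # drain the queue first
    p.join()            # the child can now flush its data and exit
    return len(obj)

if __name__ == '__main__':
    print(run())
```

Because the parent reads the item before joining, the child's feeder thread can finish writing to the pipe and the child terminates normally.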

So apparently, queue.get() should not be called after a join().

However, there are examples of using queues where get() is called after join(), such as:

import multiprocessing as mp
import random
import string

# Define an example function
def rand_string(length, output):
    """Generate a random string of digits and lower- and uppercase letters."""
    rand_str = ''.join(random.choice(
                    string.ascii_lowercase
                    + string.ascii_uppercase
                    + string.digits)
                for i in range(length))
    output.put(rand_str)

if __name__ == "__main__":
    # Define an output queue
    output = mp.Queue()

    # Set up a list of processes that we want to run
    processes = [mp.Process(target=rand_string, args=(5, output))
                 for x in range(2)]

    # Run processes
    for p in processes:
        p.start()

    # Wait for the processes to finish
    for p in processes:
        p.join()

    # Get process results from the output queue
    results = [output.get() for p in processes]

    print(results)

I've run this program and it works (it was also posted as a solution to the Stack Overflow question "Python 3 - Multiprocessing - Queue.get() does not respond").

Could someone help me understand what the rule for the deadlock is here?

Recommended answer

The queue implementation in multiprocessing that allows data to be transferred between processes relies on standard OS pipes.

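OS pipes have a finite capacity, so a process that queues data can end up blocked in the OS until some other process uses get() to retrieve data from the queue. Strictly speaking, for an unbounded Queue, put() itself hands the item to a background "feeder" thread and returns; it is the feeder thread's write to the pipe that blocks, and the process cannot exit until that thread has flushed everything.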

For small amounts of data, as in your example, the main process can join() all the spawned subprocesses and then pick up the data. This often works well, but it does not scale, and it is not clear when it will break, because the threshold depends on the size of the OS pipe buffer, which varies between platforms.

But it will certainly break with large amounts of data. The subprocess will be blocked trying to push its put() data through the pipe, waiting for the main process to remove some data from the queue with get(), while the main process is blocked in join() waiting for the subprocess to finish. This results in a deadlock.
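
The difference between small and large payloads can be observed without actually hanging, by giving join() a timeout. This is only a rough sketch: the `worker` and `join_before_get` names, the payload sizes, and the timeout values are assumptions chosen for illustration, and the exact size at which the behaviour changes depends on the platform.

```python
import multiprocessing as mp

def worker(q, size):
    # Put a single string of the requested size on the queue
    q.put('X' * size)

def join_before_get(size, timeout=2.0):
    """Return True if the child is still alive after join(timeout),
    i.e. a plain join() would not have returned at that point."""
    q = mp.Queue()
    p = mp.Process(target=worker, args=(q, size))
    p.start()
    p.join(timeout)        # join first, but give up after `timeout` seconds
    stuck = p.is_alive()
    q.get()                # drain the queue so the child can exit
    p.join()
    return stuck

if __name__ == '__main__':
    # A tiny payload normally fits in the pipe buffer, so the child exits quickly
    print('5 bytes stuck:', join_before_get(5))
    # A large payload does not fit, so the child is still alive after the timeout
    print('1 MB stuck:', join_before_get(1_000_000))
```

With the large payload the child stays alive until the parent calls get(), which is exactly the situation in which an unbounded join() would deadlock.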

Here is an example where a user had this exact issue. I posted some code in an answer there that helped him solve his problem.
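
The code from that answer is not reproduced here. As a general sketch of the safe ordering for several workers, the example from the question can be rearranged so that the results are collected before any join(). The `collect` helper and the payload size are assumptions for illustration, and the pattern assumes every worker puts exactly one item on the queue and does not crash before doing so.

```python
import multiprocessing as mp
import random
import string

def rand_string(length, output):
    """Put one random alphanumeric string of the given length on the queue."""
    alphabet = string.ascii_letters + string.digits
    output.put(''.join(random.choice(alphabet) for _ in range(length)))

def collect(num_procs, length):
    # Hypothetical helper: start the workers, drain the queue, then join
    output = mp.Queue()
    processes = [mp.Process(target=rand_string, args=(length, output))
                 for _ in range(num_procs)]
    for p in processes:
        p.start()

    # One get() per worker, done BEFORE joining, so no child is left
    # waiting for the parent to read its data
    results = [output.get() for _ in processes]

    for p in processes:
        p.join()
    return results

if __name__ == '__main__':
    results = collect(num_procs=2, length=200_000)
    print([len(r) for r in results])
```

With this ordering the string length no longer matters: join() is only reached once the queue has been emptied, so the children are always free to exit.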
