Python multiprocessing.Queue vs multiprocessing.Manager().Queue()

Problem description

I have a simple task:

import multiprocessing
import Queue  # Python 2 stdlib; on Python 3 use "from queue import Empty" and catch Empty

def worker(queue):
    while True:
        try:
            _ = queue.get_nowait()
        except Queue.Empty:
            break

if __name__ == '__main__':
    manager = multiprocessing.Manager()
    # queue = multiprocessing.Queue()
    queue = manager.Queue()

    for i in range(5):
        queue.put(i)

    processes = []

    for i in range(2):
        proc = multiprocessing.Process(target=worker, args=(queue,))
        processes.append(proc)
        proc.start()

    for proc in processes:
        proc.join()

It seems that multiprocessing.Queue can do everything I need, but on the other hand I see many examples of Manager().Queue() and can't work out which one I actually need. It looks like Manager().Queue() uses some sort of proxy object, but I don't understand its purpose, because multiprocessing.Queue() appears to do the same job without any proxy objects.

So, my questions are:

1) What is the real difference between multiprocessing.Queue and the object returned by multiprocessing.Manager().Queue()?

2) Which one should I use?

Recommended answer

Though my understanding of this subject is limited, from what I have done I can tell there is one main difference between multiprocessing.Queue() and multiprocessing.Manager().Queue():

  • multiprocessing.Queue() is an object, whereas multiprocessing.Manager().Queue() is an address (proxy) pointing to a shared queue managed by the multiprocessing.Manager() object.
  • Therefore you can't pass a normal multiprocessing.Queue() object to Pool methods, because it can't be pickled (see the sketch after this list).
  • Moreover, the Python documentation tells us to pay particular attention when using multiprocessing.Queue(), because it can have undesired effects.
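
As a quick illustration of the pickling point above, here is a minimal sketch of my own (not part of the answer or the documentation) that pickles both kinds of queue by hand; the manager proxy serializes fine, while the plain multiprocessing.Queue() raises a RuntimeError:

import multiprocessing
import pickle

if __name__ == '__main__':
    manager = multiprocessing.Manager()
    proxy_queue = manager.Queue()          # proxy to a queue living in the manager process
    plain_queue = multiprocessing.Queue()

    # The proxy pickles: only the token/address of the shared queue is serialized.
    pickle.dumps(proxy_queue)

    # The plain queue refuses to be pickled outside of process inheritance.
    try:
        pickle.dumps(plain_queue)
    except RuntimeError as exc:
        print(exc)  # "Queue objects should only be shared between processes through inheritance"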

Note: When an object is put on a queue, the object is pickled and a background thread later flushes the pickled data to an underlying pipe. This has some consequences which are a little surprising, but should not cause any practical difficulties – if they really bother you then you can instead use a queue created with a manager. After putting an object on an empty queue there may be an infinitesimal delay before the queue's empty() method returns False and get_nowait() can return without raising Queue.Empty. If multiple processes are enqueuing objects, it is possible for the objects to be received at the other end out-of-order. However, objects enqueued by the same process will always be in the expected order with respect to each other.

Warning: As mentioned above, if a child process has put items on a queue (and it has not used JoinableQueue.cancel_join_thread), then that process will not terminate until all buffered items have been flushed to the pipe. This means that if you try joining that process you may get a deadlock unless you are sure that all items which have been put on the queue have been consumed. Similarly, if the child process is non-daemonic then the parent process may hang on exit when it tries to join all its non-daemonic children. Note that a queue created using a manager does not have this issue.
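
The ordering that warning asks for can be shown with a short sketch of my own (assuming Python 3): drain the queue before joining the producer, because the producer only exits once its buffered items have been flushed to the pipe:

import multiprocessing

def producer(q):
    # A payload this large cannot fit in the pipe buffer in one go, so the
    # feeder thread keeps the child alive until someone reads the item.
    q.put('x' * 10_000_000)

if __name__ == '__main__':
    q = multiprocessing.Queue()
    proc = multiprocessing.Process(target=producer, args=(q,))
    proc.start()

    item = q.get()    # consume first ...
    proc.join()       # ... then join; joining before q.get() here could deadlock

    print(len(item))  # 10000000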

There is a workaround that lets you use multiprocessing.Queue() with a Pool: store the queue in a global variable and set it in every worker process at initialization:

import multiprocessing

queue = multiprocessing.Queue()

def initialize_shared(q):
    global queue
    queue = q

nb_process = 2  # number of pool worker processes
pool = multiprocessing.Pool(nb_process, initializer=initialize_shared,
                            initargs=(queue,))

This will create pool processes with a correctly shared queue, but one can argue that multiprocessing.Queue() objects were not designed for this use.
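
For completeness, here is a hypothetical end-to-end version of that workaround (the names drain_one and nb_process are mine); each pool worker reads from the module-level queue installed by the initializer rather than receiving it as a task argument, and the pool creation sits under the __main__ guard so the sketch also works with the spawn start method:

import multiprocessing
from queue import Empty  # Python 2: from Queue import Empty

queue = None  # replaced in every worker process by initialize_shared()

def initialize_shared(q):
    global queue
    queue = q

def drain_one(_):
    # The queue is *not* passed as a task argument; it comes from the global.
    try:
        return queue.get_nowait()
    except Empty:
        return None

if __name__ == '__main__':
    q = multiprocessing.Queue()
    for i in range(5):
        q.put(i)

    nb_process = 2
    pool = multiprocessing.Pool(nb_process, initializer=initialize_shared,
                                initargs=(q,))
    print(pool.map(drain_one, range(5)))  # the five items in some order (None on a rare miss)
    pool.close()
    pool.join()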

On the other hand, a manager.Queue() can be shared between pool subprocesses simply by passing it as a normal argument to the worker function.
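
For instance (a sketch of my own, assuming Python 3), the proxy can go straight into pool.apply or pool.map as an ordinary argument:

import multiprocessing
from queue import Empty

def drain(q):
    # q is a manager proxy; only its token was pickled and sent to the worker.
    items = []
    while True:
        try:
            items.append(q.get_nowait())
        except Empty:
            return items

if __name__ == '__main__':
    manager = multiprocessing.Manager()
    q = manager.Queue()
    for i in range(5):
        q.put(i)

    with multiprocessing.Pool(2) as pool:
        print(pool.apply(drain, (q,)))  # e.g. [0, 1, 2, 3, 4]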

In my opinion, using multiprocessing.Manager().Queue() is fine in every case and less troublesome. There might be drawbacks to using a manager, but I am not aware of any.
