Why is multiprocessing running things in the same process?


Question

I'm running the solution below, taken from How can I recover the return value of a function passed to multiprocessing.Process?:

import multiprocessing
from os import getpid

def worker(procnum):
    print('I am number %d in process %d' % (procnum, getpid()))
    return getpid()

if __name__ == '__main__':
    pool = multiprocessing.Pool(processes = 3)
    print(pool.map(worker, range(5)))

It should output something like the following:

I am number 0 in process 19139
I am number 1 in process 19138
I am number 2 in process 19140
I am number 3 in process 19139
I am number 4 in process 19140
[19139, 19138, 19140, 19139, 19140]

But I only get

[4212, 4212, 4212, 4212, 4212]

If I feed pool.map a range of 1,000,000 using more than 10 processes, I see at most two different pids.

Why is that?

Solution

TL;DR: tasks are not distributed in any particular way; most likely your tasks are so short that they are all completed before the other processes get started.

From looking at the source of multiprocessing, it appears that tasks are simply put on a queue, which the worker processes read from (the worker function reads from Pool._inqueue). There's no calculated distribution going on; the workers simply race to grab work as fast as possible.

The most likely bet, then, is that the tasks are simply so short that one process finishes all of them before the others have a chance to look, or even get started. You can easily check whether this is the case by adding a two-second sleep to the task.

I'll note that on my machine, the tasks all get spread over the processes pretty evenly (also for #processes > #cores). So there seems to be some system dependence, even though all processes should have been .start()ed before work is queued.

Here's some trimmed source from worker, which shows that the tasks are just read from the queue by each process, so in pseudo-random order:

def worker(inqueue, outqueue, ...):
    ...
    get = inqueue.get
    ...
    while maxtasks is None or (maxtasks and completed < maxtasks):
        try:
            task = get()
        ...

The SimpleQueue uses a Pipe; from the SimpleQueue constructor:

self._reader, self._writer = Pipe(duplex=False)
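A quick illustration of that reader/writer pair (the 'task' payload here is just a placeholder):

```python
from multiprocessing import Pipe

# duplex=False gives a one-way pipe: the first connection can only
# receive, the second can only send -- the shape SimpleQueue builds on
reader, writer = Pipe(duplex=False)
writer.send('task')
print(reader.recv())  # prints: task
```

Every pool worker holds the same reader end, which is why whichever worker calls recv() first simply wins the item.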

EDIT: possibly the part about processes starting too slowly is false, so I removed it. All processes are .start()ed before any work is queued (which may be platform-dependent). I can't find whether a process is ready at the moment .start() returns.
