Python: multiprocess workers, tracking tasks completed (missing completions)

Problem description

The default multiprocessing.Pool code includes a counter that tracks how many tasks a worker has completed:

    completed += 1
logging.debug('worker exiting after %d tasks' % completed)

But going from range(12) to range(20) in a pool.map call appears to produce errors in the counter (this seems to be unrelated to worker creation). I am not sure what is causing this.

For example:

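import multiprocessing as mp

def ret_x(x):
    return x

def inform():
    # Runs once in every newly started worker process.
    print('made a worker!')

# The guard keeps worker processes from re-running this block on import.
if __name__ == '__main__':
    pool = mp.Pool(2, maxtasksperchild=2, initializer=inform)
    res = pool.map(ret_x, range(8))
    print(res)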

works as expected, giving:

made a worker!
made a worker!
worker exiting after 2 tasks
worker exiting after 2 tasks
made a worker!
worker exiting after 2 tasks
made a worker!
worker exiting after 2 tasks
[0, 1, 2, 3, 4, 5, 6, 7]

But changing the range to 20 doesn't show any additional workers being created, nor a total of 20 completed tasks, even though the full range is returned in the expected list.

made a worker!
made a worker!
worker exiting after 2 tasks
worker exiting after 2 tasks
made a worker!
worker exiting after 2 tasks
made a worker!
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
worker exiting after 1 tasks

Recommended answer

It works that way because you are not explicitly setting chunksize in pool.map:

map(func, iterable[, chunksize])

This method chops the iterable into a number of chunks which it submits to the process pool as separate tasks. The (approximate) size of these chunks can be specified by setting chunksize to a positive integer.

Source: https://docs.python.org/2/library/multiprocessing.html#module-multiprocessing.pool
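To make the "chops the iterable into chunks" step concrete, here is a minimal sketch of that kind of splitting. The helper name split_into_chunks is hypothetical and is not part of the multiprocessing API; it only mirrors the general idea of slicing an iterator into fixed-size tuples, and the chunk size of 3 is an assumption chosen for illustration.

```python
from itertools import islice

def split_into_chunks(iterable, chunksize):
    """Yield successive tuples holding at most `chunksize` items each.

    Hypothetical helper for illustration; not a multiprocessing function.
    """
    it = iter(iterable)
    while True:
        chunk = tuple(islice(it, chunksize))
        if not chunk:
            return
        yield chunk

# 20 items in chunks of 3: six full chunks plus one short chunk.
chunks = list(split_into_chunks(range(20), 3))
print(len(chunks))   # 7
print(chunks[0])     # (0, 1, 2)
print(chunks[-1])    # (18, 19)
```

Each tuple here corresponds to one task from the pool's point of view, which is why a task counter can be smaller than the number of items.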

For 8 items and a pool of 2 processes, chunksize will be 1 (divmod(8, 2*4) gives a quotient of 1 and no remainder), so you see (8/1)/2 = 4 workers.

workers = (number of items / chunksize) / tasks per worker

For 20 items and a pool of 2 processes, chunksize will be 3 (divmod(20, 2*4) gives a quotient of 2 and a remainder of 4, and a nonzero remainder bumps the chunksize up by one), so you see something like (20/3)/2 = 3.3 workers. In practice that means 7 chunks and 4 workers, the last of which exits after only 1 task.

For 40 items, chunksize = 5, so workers = (40/5)/2 = 4 workers.
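The arithmetic above can be checked with a short sketch. The function names default_chunksize and expected_workers are hypothetical and are not part of multiprocessing; the chunksize rule follows the divmod calculation described in this answer, and the result is the minimum number of workers needed to consume all chunks (the pool may start additional replacement workers that never receive a task).

```python
import math

def default_chunksize(n_items, pool_size):
    """Chunksize rule described above: divmod by 4 * pool size, rounded up."""
    chunksize, extra = divmod(n_items, pool_size * 4)
    if extra:
        chunksize += 1
    return chunksize

def expected_workers(n_items, pool_size, maxtasksperchild, chunksize=None):
    """Return (chunksize, number of chunks, workers needed) for a map call."""
    if n_items == 0:
        return (0, 0, 0)
    if chunksize is None:
        chunksize = default_chunksize(n_items, pool_size)
    n_chunks = math.ceil(n_items / chunksize)          # one chunk == one task
    workers = math.ceil(n_chunks / maxtasksperchild)   # each worker takes a few tasks
    return (chunksize, n_chunks, workers)

for n in (8, 20, 40):
    print(n, expected_workers(n, pool_size=2, maxtasksperchild=2))
# 8 (1, 8, 4)
# 20 (3, 7, 4)
# 40 (5, 8, 4)
```

Rounding up at both steps is what accounts for the worker that exits after a single task in the 20-item run.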

If you want, you can set chunksize=1:

res = pool.map(ret_x, range(20), 1)

And you will see (20/1)/2 = 10 workers:

python mppp.py
made a worker!
made a worker!
made a worker!
made a worker!
made a worker!
made a worker!
made a worker!
made a worker!
made a worker!
made a worker!
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

So chunksize is roughly the unit of work handed to a process: each chunk counts as one task, however many items it contains.

How chunksize is calculated: https://hg.python.org/cpython/file/1c54def5947c/Lib/multiprocessing/pool.py#l305
