Counting total number of tasks executed in a multiprocessing.Pool during execution

Question

I'd love to give an indication of overall progress. I'm farming work out and would like to know how far along it is. So if I send 100 jobs to 10 processors, how can I show the number of jobs that have returned so far? I can get the process IDs, but how do I count the completed jobs returned from my map function?

I'm calling my function as follows:

op_list = pool.map(PPMDR_star, list(varg))

And in my function I can print the name of the current process:

current = multiprocessing.current_process()
print('Running:', current.name, current._identity)

Recommended answer

If you use pool.map_async, you can pull this information out of the MapResult instance that it returns. For example:

import multiprocessing
import time

def worker(i):
    time.sleep(i)
    return i


if __name__ == "__main__":
    pool = multiprocessing.Pool()
    result = pool.map_async(worker, range(15))
    while not result.ready():
        print("num left: {}".format(result._number_left))
        time.sleep(1)
    real_result = result.get()
    pool.close()
    pool.join()

Output:

num left: 15
num left: 14
num left: 13
num left: 12
num left: 11
num left: 10
num left: 9
num left: 9
num left: 8
num left: 8
num left: 7
num left: 7
num left: 6
num left: 6
num left: 6
num left: 5
num left: 5
num left: 5
num left: 4
num left: 4
num left: 4
num left: 3
num left: 3
num left: 3
num left: 2
num left: 2
num left: 2
num left: 2
num left: 1
num left: 1
num left: 1
num left: 1

multiprocessing internally breaks the iterable you pass to map into chunks and passes each chunk to a child process. So the _number_left attribute really keeps track of the number of chunks remaining, not the individual elements in the iterable. Keep that in mind if you see odd-looking numbers when you use large iterables. Chunking is there to improve IPC performance, but if an accurate tally of completed results matters more to you than the added performance, you can pass the chunksize=1 keyword argument to map_async to make _number_left more accurate. (The chunksize usually only makes a noticeable performance difference for very large iterables. Try it for yourself to see if it really matters for your use case.)
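As a rough sketch of the chunksize=1 idea: with one item per chunk, the number of chunks left equals the number of items left, so a completed count can be derived by subtraction. The helper name map_with_progress, its poll_interval parameter, and the stand-in worker are assumptions made for illustration. The code still depends on the private _number_left attribute.

```python
import multiprocessing
import time


def worker(i):
    # Hypothetical stand-in task: sleep briefly, then return the input.
    time.sleep(0.05)
    return i


def map_with_progress(pool, func, items, poll_interval=0.1):
    """Map func over items, printing how many items have completed.

    Returns (results, snapshots), where snapshots is the list of
    completed counts observed while polling. Relies on the private
    MapResult._number_left attribute, so it may break in other versions.
    """
    items = list(items)
    total = len(items)
    # chunksize=1 gives one chunk per item, so chunks left == items left.
    result = pool.map_async(func, items, chunksize=1)
    snapshots = []
    while not result.ready():
        done = total - result._number_left
        snapshots.append(done)
        print("completed: {}/{}".format(done, total))
        time.sleep(poll_interval)
    return result.get(), snapshots


if __name__ == "__main__":
    with multiprocessing.Pool(4) as pool:
        results, _ = map_with_progress(pool, worker, range(20))
    print(results)
```

The pool is passed in as a parameter so the same helper can be tried with a different pool size, or with multiprocessing.pool.ThreadPool, without changes.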

As you mentioned in the comments, pool.map blocks, so you can't get this information from it unless you start a background thread that does the polling while the main thread is blocked in the map call. I'm not sure there's any benefit to doing that over the approach above.

The other thing to keep in mind is that _number_left is an internal attribute of MapResult, so this could break in future versions of Python.
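If depending on a private attribute is a concern, a possible alternative is to count results as they arrive from pool.imap_unordered, which is part of the public Pool API. This is a different technique from the map_async polling shown earlier, and the helper name run_with_progress and the stand-in worker are assumptions for illustration. Results come back in completion order, not input order.

```python
import multiprocessing
import time


def worker(i):
    # Hypothetical stand-in task: sleep briefly, then return the input.
    time.sleep(0.05)
    return i


def run_with_progress(pool, func, items):
    """Collect results from imap_unordered, printing a running count."""
    items = list(items)
    total = len(items)
    results = []
    # imap_unordered yields each result as soon as a worker finishes it.
    for done, value in enumerate(pool.imap_unordered(func, items), start=1):
        results.append(value)
        print("completed: {}/{}".format(done, total))
    return results


if __name__ == "__main__":
    with multiprocessing.Pool(4) as pool:
        results = run_with_progress(pool, worker, range(20))
    print(sorted(results))
```

If input order matters, pool.imap can be used in the same loop, at the cost of progress appearing to stall whenever an early item is slow.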
