Multiple outputs returned from a Python multiprocessing function

Problem description

I am trying to use multiprocessing to return a list, but instead of waiting until all processes are done, I get several returns from the single return statement in mp_factorizer, like this:

None
None
(returns list)

In this example I used 2 worker processes. If I used 5, there would be 5 None returns before the list is put out. Here is the code:

import math
import multiprocessing

def mp_factorizer(nums, nprocs, objecttouse):
    if __name__ == '__main__':
        out_q = multiprocessing.Queue()
        chunksize = int(math.ceil(len(nums) / float(nprocs)))
        procs = []
        for i in range(nprocs):
            p = multiprocessing.Process(
                    target=worker,
                    args=(nums[chunksize * i:chunksize * (i + 1)],
                          out_q,
                          objecttouse))
            procs.append(p)
            p.start()

        # Collect all results into a single result list. We know how many lists
        # with results to expect.
        resultlist = []
        for i in range(nprocs):
            temp = out_q.get()
            index = 0
            for i in temp:
                resultlist.append(temp[index][0][0:])
                index += 1

        # Wait for all worker processes to finish
        for p in procs:
            p.join()

        # Drop empty entries before returning
        resultlist2 = [x for x in resultlist if x != []]
        return resultlist2

def worker(nums, out_q, objecttouse):
    """ The worker function, invoked in a process. 'nums' is a
        list of numbers to factor. The results are placed in
        a dictionary that's pushed to a queue.
    """
    outlist = []
    for n in nums:        
        outputlist=objecttouse.getevents(n)
        if outputlist:
            outlist.append(outputlist)   
    out_q.put(outlist)

mp_factorizer gets a list of items, the number of worker processes, and an object the workers should use. It splits up the list of items so that every worker gets an equal share, then starts the workers. Each worker uses the object to calculate something from its part of the list and adds the result to the queue. mp_factorizer is supposed to collect all results from the queue, merge them into one large list and return that list. However, I get multiple returns.

What am I doing wrong? Or is this expected behavior due to the strange way Windows handles multiprocessing? (Python 2.7.3, Windows 7 64-bit)

The problem was the wrong placement of if __name__ == '__main__':. I found out while working on another problem; see using multiprocessing in a sub process for a complete explanation.

Answer

if __name__ == '__main__' is in the wrong place. A quick fix would be to protect only the call to mp_factorizer, as Janne Karila suggested:

if __name__ == '__main__':
    print mp_factorizer(list, 2, someobject)

However, on Windows the main file is executed once for the original run plus once for every worker process, 2 in this case, so everything outside the protected part of the code runs a total of 3 times.

This can cause problems as soon as other computations are made at module level, and at the very least it slows things down unnecessarily. Even though only the worker function should be executed several times, on Windows everything that is not protected by if __name__ == '__main__' is executed again in every worker process.
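The extra executions are easy to see with a small test script. The following is a minimal sketch (not from the question, just for illustration): on Windows, where each child process re-imports the main module, the unguarded print runs once in the parent and once per worker, while everything inside the guard runs only once.

import multiprocessing

print("module-level code runs")  # on Windows: printed by the parent AND by every worker

def worker(out_q):
    out_q.put("done")

if __name__ == '__main__':
    out_q = multiprocessing.Queue()
    procs = [multiprocessing.Process(target=worker, args=(out_q,)) for _ in range(2)]
    for p in procs:
        p.start()
    results = [out_q.get() for _ in range(2)]
    for p in procs:
        p.join()
    print(results)  # only printed once, because it is inside the guard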

So the solution is to protect the whole main process by executing all of its code only inside if __name__ == '__main__'.

If the worker function is in the same file, however, it needs to stay outside of this if statement, because otherwise the spawned worker processes cannot find it when they re-import the module.

Pseudocode main thread (worker function in another file):

# Import stuff
if __name__ == '__main__':
    #execute whatever you want, it will only be executed 
    #as often as you intend it to
    #execute the function that starts multiprocessing, 
    #in this case mp_factorizer()
    #there is no worker function code here, it's in another file.

Even though the whole main process is protected, the worker function can still be started, as long as it lives in another file.

Pseudocode main thread, with the worker function in the same file:

# Import stuff
#If the worker code is in the main thread, exclude it from the if statement:
def worker():
    #worker code
if __name__ == '__main__':
    #execute whatever you want, it will only be executed 
    #as often as you intend it to
    #execute the function that starts multiprocessing, 
    #in this case mp_factorizer()
#All code outside of the if statement will be executed multiple times,
#depending on the number of assigned worker processes.
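Applied to the code from the question, the corrected layout would look roughly like the sketch below. It is not a drop-in replacement: EventSource is a made-up stand-in for the question's objecttouse, and the result collection is simplified to a plain flatten. The important part is that worker and mp_factorizer sit at module level while only the top-level call is wrapped in if __name__ == '__main__'.

import math
import multiprocessing

class EventSource(object):
    # Made-up stand-in for 'objecttouse': anything with a getevents(n) method works.
    def getevents(self, n):
        return [n, n * n]

def worker(nums, out_q, objecttouse):
    # Module level, so the spawned worker processes can find it on re-import.
    outlist = []
    for n in nums:
        outputlist = objecttouse.getevents(n)
        if outputlist:
            outlist.append(outputlist)
    out_q.put(outlist)

def mp_factorizer(nums, nprocs, objecttouse):
    # No __name__ check in here any more -- the guard only wraps the call below.
    out_q = multiprocessing.Queue()
    chunksize = int(math.ceil(len(nums) / float(nprocs)))
    procs = []
    for i in range(nprocs):
        p = multiprocessing.Process(
            target=worker,
            args=(nums[chunksize * i:chunksize * (i + 1)], out_q, objecttouse))
        procs.append(p)
        p.start()

    # Collect one partial result list per worker and flatten them.
    resultlist = []
    for _ in range(nprocs):
        resultlist.extend(out_q.get())

    for p in procs:
        p.join()

    return [x for x in resultlist if x != []]

if __name__ == '__main__':
    print(mp_factorizer(range(10), 2, EventSource()))

Run this way, mp_factorizer is called exactly once in the parent process; the worker processes merely re-import the module without triggering another call.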

For a longer explanation with runnable code, see using multiprocessing in a sub process.
