Python 3: does Pool keep the original order of data passed to map?


Question

I have written a little script that distributes a workload across 4 worker processes and tests whether the results stay ordered with respect to the order of the input:

from multiprocessing import Pool
import numpy as np
import time
import random


rows = 16
columns = 1000000

vals = np.arange(rows * columns, dtype=np.int32).reshape(rows, columns)

def worker(arr):
    time.sleep(random.random())        # let the process sleep a random
    for idx in np.ndindex(arr.shape):  # amount of time to ensure that
        arr[idx] += 1                  # the processes finish at different
                                       # time steps
    return arr

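# the guard is required on platforms that start workers by re-importing this module
if __name__ == "__main__":
    # create a pool of 4 worker processes (Pool uses processes, not threads)
    with Pool(4) as p:
        # schedule one worker call for each row in the original data
        q = p.map(worker, list(vals))

    for idx, row in enumerate(q):
        print("[{:0>2}]: {: >8} - {: >8}".format(idx, row[0], row[-1]))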

For me, this always results in:

[00]:        1 -  1000000
[01]:  1000001 -  2000000
[02]:  2000001 -  3000000
[03]:  3000001 -  4000000
[04]:  4000001 -  5000000
[05]:  5000001 -  6000000
[06]:  6000001 -  7000000
[07]:  7000001 -  8000000
[08]:  8000001 -  9000000
[09]:  9000001 - 10000000
[10]: 10000001 - 11000000
[11]: 11000001 - 12000000
[12]: 12000001 - 13000000
[13]: 13000001 - 14000000
[14]: 14000001 - 15000000
[15]: 15000001 - 16000000

Question: So, does Pool really keep the original input order when it stores the results of each worker call in q?

Sidenote: I am asking because I need an easy way to parallelize work over several workers. In some cases the ordering is irrelevant. In others, the results (like those in q) have to be returned in the original order, because I apply an additional reduce function that relies on ordered data.

Performance: On my machine this runs about 4 times faster than the same work in a single process, as expected with 4 cores. All 4 cores are at 100% usage for the duration of the run.

Recommended answer

Yes. Pool.map returns its results in the same order as the inputs. If you need that order, great; if you don't, Pool.imap_unordered may be a useful optimization, because it yields each result as soon as it is ready instead of waiting for earlier ones.
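
As a rough illustration of the difference, here is a minimal sketch that runs the same task through both calls. The names slow_square, ordered_squares and unordered_squares are made up for this example and are not part of the original script; the random sleep only exists to make tasks finish at unpredictable times.

```python
import multiprocessing
import random
import time


def slow_square(x):
    # Hypothetical task: sleep briefly so tasks finish in an unpredictable order.
    time.sleep(random.random() * 0.05)
    return x * x


def ordered_squares(values, processes=4):
    # Pool.map returns a list whose i-th entry is the result for values[i].
    with multiprocessing.Pool(processes) as pool:
        return pool.map(slow_square, values)


def unordered_squares(values, processes=4):
    # imap_unordered yields results as workers finish them, so the order
    # may differ from the input order. Consume it before the pool closes.
    with multiprocessing.Pool(processes) as pool:
        return list(pool.imap_unordered(slow_square, values))


if __name__ == "__main__":
    data = list(range(16))
    print(ordered_squares(data))    # same order as the input
    print(unordered_squares(data))  # same values, order may vary between runs
```

If only the set of results matters (for example, when summing them), the unordered variant lets the consumer start on finished results earlier; whether that is a measurable gain depends on the workload.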

Note that while the order in which you receive the results from Pool.map is fixed, the order in which they are computed is arbitrary.
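
One way to see this is to have each task report when it finished. The sketch below is an assumption-laden illustration, not part of the original answer: timed_task and finish_times are hypothetical names, and the delays are chosen so that the first task is very likely to finish last when 4 workers are available.

```python
import multiprocessing
import time


def timed_task(args):
    # Hypothetical task: wait for the given delay, then report the input
    # index together with the wall-clock time at which the task finished.
    index, delay = args
    time.sleep(delay)
    return index, time.time()


def finish_times(delays, processes=4):
    # chunksize=1 hands tasks to workers one at a time, so tasks with
    # different delays can run concurrently on different workers.
    with multiprocessing.Pool(processes) as pool:
        return pool.map(timed_task, list(enumerate(delays)), chunksize=1)


if __name__ == "__main__":
    # Task 0 sleeps the longest, so it should be computed last,
    # yet it still comes first in the returned list.
    for index, finished in finish_times([0.5, 0.0, 0.0, 0.0]):
        print(index, finished)
```

The returned list is indexed by input position, so any side effects inside the worker (logging, writing files) should not be assumed to happen in input order.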
