How does ThreadPoolExecutor().map differ from ThreadPoolExecutor().submit?

Question

I was just very confused by some code that I wrote. I was surprised to discover that:

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(f, iterable))

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    results = list(map(lambda x: executor.submit(f, x), iterable))

produce different results. The first one produces a list of whatever type f returns, the second produces a list of concurrent.futures.Future objects that then need to be evaluated with their result() method in order to get the value that f returned.
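
To make the difference concrete, here is a minimal sketch. The function square and the list inputs are hypothetical stand-ins for f and iterable, not part of the original code:

```python
import concurrent.futures

def square(x):
    # Hypothetical stand-in for f.
    return x * x

inputs = [1, 2, 3]  # hypothetical stand-in for iterable

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    # executor.map yields the return values themselves, in input order.
    mapped = list(executor.map(square, inputs))

    # executor.submit returns a Future for each call.
    futures = [executor.submit(square, x) for x in inputs]
    # The values only become available through Future.result().
    unwrapped = [fut.result() for fut in futures]

print(mapped)                     # [1, 4, 9]
print(type(futures[0]).__name__)  # Future
print(unwrapped)                  # [1, 4, 9]
```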

My main concern is that this means that executor.map can't take advantage of concurrent.futures.as_completed, which seems like an extremely convenient way to evaluate the results of some long-running calls to a database that I'm making as they become available.

I'm not at all clear on how concurrent.futures.ThreadPoolExecutor objects work -- naively, I would prefer the (somewhat more verbose):

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    result_futures = list(map(lambda x: executor.submit(f, x), iterable))
    results = [future.result() for future in concurrent.futures.as_completed(result_futures)]

over the more concise executor.map in order to take advantage of a possible gain in performance. Am I wrong to do so?

Recommended answer

The problem is that you convert the result of ThreadPoolExecutor.map to a list. If you don't do this and instead iterate over the resulting generator directly, the results are still yielded in the original order, but the loop can start processing them before all results are ready. You can test this with this example:

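import time
import concurrent.futures

e = concurrent.futures.ThreadPoolExecutor(4)
s = range(10)
# time.sleep returns None, so this prints None ten times; what matters is
# that the lines appear gradually as the sleeps finish, not all at once.
for i in e.map(time.sleep, s):
    print(i)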
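
If you want to see the timing rather than just watch the output, the following is a rough sketch under a few assumptions: slow_echo, delays and arrivals are names made up for illustration, and the delays are kept short so the script runs quickly. It records when each result comes out of the iterator:

```python
import concurrent.futures
import time

def slow_echo(delay):
    # Hypothetical worker: sleeps, then returns its argument.
    time.sleep(delay)
    return delay

delays = [0.1, 0.3, 0.2]  # hypothetical workload with uneven run times

arrivals = []
start = time.perf_counter()
with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    # No list() here: each value is handled as soon as the iterator yields it.
    for value in executor.map(slow_echo, delays):
        arrivals.append((value, time.perf_counter() - start))

for value, elapsed in arrivals:
    print(f"got {value} after {elapsed:.2f}s")
```

The first result should show up well before the whole batch is done. The 0.2 result, on the other hand, is only yielded after the 0.3 result that precedes it in the input, because map preserves input order.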

Order is probably preserved because it is sometimes important to get results in the same order as the inputs you gave to map. The results are probably not wrapped in Future objects because, in some situations, making another pass over the list to fetch every result would simply take too long. After all, in most cases the next value is very likely to be ready before the loop has finished processing the first one. This is demonstrated in this example:

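import concurrent.futures

# some_huge_list, crunch_number and do_some_stuff are placeholders
# for your own data source and functions.
executor = concurrent.futures.ThreadPoolExecutor() # Or ProcessPoolExecutor
data = some_huge_list()
results = executor.map(crunch_number, data)
finals = []

# Each value is processed as soon as map yields it, in input order.
for value in results:
    finals.append(do_some_stuff(value))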

In this example do_some_stuff may well take longer than crunch_number. If that is the case, you lose very little performance while still keeping the easy usage of map.

Also, since the worker threads (or processes) start at the beginning of the list you submitted and work their way to the end, the results should tend to finish in the same order in which the iterator yields them. This means that in most cases executor.map is just fine. In some cases, though, concurrent.futures.as_completed may be faster: for example, if the order in which you process the values doesn't matter and the function you passed to map takes very different times to run.
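
For that last case, here is a minimal sketch of the submit plus as_completed pattern. The names slow_square, jobs and future_to_input are hypothetical and chosen only for illustration. Since as_completed yields futures in whatever order they finish, a dict is used to recover which input each future belongs to:

```python
import concurrent.futures
import time

def slow_square(x, delay):
    # Hypothetical worker whose run time varies from call to call.
    time.sleep(delay)
    return x * x

# Hypothetical workload: input value -> simulated run time in seconds.
jobs = {1: 0.3, 2: 0.1, 3: 0.2}

results = {}
with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    # Remember which input each Future was created for.
    future_to_input = {
        executor.submit(slow_square, x, delay): x for x, delay in jobs.items()
    }
    # Futures are yielded as they complete, not in submission order.
    for future in concurrent.futures.as_completed(future_to_input):
        x = future_to_input[future]
        results[x] = future.result()
        print(f"input {x} finished with result {results[x]}")
```

With these delays the short job will typically be reported first, even though it was submitted second. The trade-off is that you have to keep track of the inputs yourself, which executor.map does for you.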
