Concurrent.futures vs Multiprocessing in Python 3

Question

Python 3.2 introduced Concurrent Futures, which appear to be some advanced combination of the older threading and multiprocessing modules.

What are the advantages and disadvantages of using this for CPU bound tasks over the older multiprocessing module?

This article suggests they're much easier to work with - is that the case?

Recommended answer

I wouldn't call concurrent.futures more "advanced" - it's a simpler interface that works very much the same regardless of whether you use multiple threads or multiple processes as the underlying parallelization gimmick.
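
To make the "works very much the same" point concrete, here is a minimal sketch. The names square and run_all are made up for illustration; the only thing that changes between the two runs is the executor class passed in.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def square(n):
    """Toy task (hypothetical); stands in for whatever work each item needs."""
    return n * n


def run_all(executor_cls, nums, workers=2):
    # Only the executor class differs; the calling code is identical.
    with executor_cls(max_workers=workers) as executor:
        # map() yields results in the same order as the inputs.
        return list(executor.map(square, nums))


if __name__ == "__main__":
    nums = [1, 2, 3, 4]
    print(run_all(ThreadPoolExecutor, nums))   # threads: [1, 4, 9, 16]
    print(run_all(ProcessPoolExecutor, nums))  # processes: [1, 4, 9, 16]
```

The `if __name__ == "__main__":` guard matters for the process-based run on platforms that start workers by re-importing the main module.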

So, like virtually all instances of "simpler interface", much the same trade-offs are involved: it has a shallower learning curve, in large part just because there's so much less available to be learned; but, because it offers fewer options, it may eventually frustrate you in ways the richer interfaces won't.

So far as CPU-bound tasks go, that's far too under-specified to say anything very meaningful. For CPU-bound tasks under CPython, you need multiple processes rather than multiple threads to have any chance of getting a speedup. But how much of a speedup you get (if any) depends on the details of your hardware, your OS, and especially on how much inter-process communication your specific tasks require. Under the covers, all inter-process parallelization gimmicks rely on the same OS primitives - the high-level API you use to get at those isn't a primary factor in bottom-line speed.
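
As a rough illustration of the inter-process communication point, the sketch below uses the chunksize parameter of Executor.map (honored by ProcessPoolExecutor since Python 3.5, ignored by ThreadPoolExecutor) to send inputs to workers in batches rather than one at a time. The task count_divisors and the helper divisor_counts are invented for this example, and whether batching helps at all depends on your workload and machine.

```python
from concurrent.futures import ProcessPoolExecutor


def count_divisors(n):
    """Deliberately slow, CPU-bound toy task (hypothetical)."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)


def divisor_counts(nums, nprocs=2, chunksize=1):
    # A larger chunksize ships several inputs to a worker per message,
    # which reduces the number of inter-process round trips.
    with ProcessPoolExecutor(max_workers=nprocs) as executor:
        return list(executor.map(count_divisors, nums, chunksize=chunksize))


if __name__ == "__main__":
    nums = list(range(1, 2001))
    one_by_one = divisor_counts(nums, chunksize=1)
    batched = divisor_counts(nums, chunksize=100)
    # The results are the same; only the communication pattern differs.
    print(one_by_one == batched)
```

Timing the two calls on your own hardware is the only reliable way to see whether the batching makes a difference for a given task.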

Example

Here's the final code shown in the article you referenced, but I'm adding an import statement needed to make it work:

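from concurrent.futures import ProcessPoolExecutor

# factorize_naive is defined in the referenced article and is not shown here.
def pool_factorizer_map(nums, nprocs):
    # Let the executor divide the work among processes by using 'map'.
    with ProcessPoolExecutor(max_workers=nprocs) as executor:
        # executor.map yields results in input order, so zip pairs each
        # number with its own factors.
        return {num: factors for num, factors in
                zip(nums, executor.map(factorize_naive, nums))}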

Here's exactly the same thing using multiprocessing instead:

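import multiprocessing as mp

# factorize_naive is defined in the referenced article and is not shown here.
def mp_factorizer_map(nums, nprocs):
    with mp.Pool(nprocs) as pool:
        # pool.map blocks until every result is ready and returns them
        # in input order.
        return {num: factors for num, factors in
                zip(nums, pool.map(factorize_naive, nums))}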

Note that the ability to use multiprocessing.Pool objects as context managers was added in Python 3.3.

As for which one is easier to work with, they're essentially identical.
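
For anyone who wants to run the two versions side by side, here is a self-contained sketch. The article's factorize_naive is not reproduced in this answer, so the trial-division function below is only a stand-in written for this example, and the input numbers are arbitrary.

```python
import multiprocessing as mp
from concurrent.futures import ProcessPoolExecutor


def factorize_naive(n):
    """Stand-in (assumption, not the article's code): prime factors by trial division."""
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors


def pool_factorizer_map(nums, nprocs):
    with ProcessPoolExecutor(max_workers=nprocs) as executor:
        return dict(zip(nums, executor.map(factorize_naive, nums)))


def mp_factorizer_map(nums, nprocs):
    with mp.Pool(nprocs) as pool:
        return dict(zip(nums, pool.map(factorize_naive, nums)))


if __name__ == "__main__":
    nums = [12, 97, 360, 1001]
    # Both functions should build the same number -> factors mapping.
    print(pool_factorizer_map(nums, 2) == mp_factorizer_map(nums, 2))
```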

One difference is that Pool supports so many different ways of doing things that you may not realize how easy it can be until you've climbed quite a way up the learning curve.
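
To give a feel for how many entry points Pool has, here is a small tour that pushes the same toy work through several of them. The functions cube, add and pool_tour are made up for illustration, and this is only a sample of the Pool API.

```python
import multiprocessing as mp


def cube(n):
    return n ** 3


def add(a, b):
    return a + b


def pool_tour(nums):
    """Run the same toy work through several Pool entry points."""
    with mp.Pool(2) as pool:
        return {
            # Blocks until done; results in input order.
            "map": pool.map(cube, nums),
            # Lazy iterator; results in input order.
            "imap": list(pool.imap(cube, nums)),
            # Lazy iterator; results in completion order, so sort them here.
            "imap_unordered": sorted(pool.imap_unordered(cube, nums)),
            # Returns an AsyncResult immediately; get() waits for the list.
            "map_async": pool.map_async(cube, nums).get(),
            # One AsyncResult per call.
            "apply_async": [pool.apply_async(cube, (n,)).get() for n in nums],
            # Unpacks each tuple into positional arguments.
            "starmap": pool.starmap(add, zip(nums, nums)),
        }


if __name__ == "__main__":
    for name, result in pool_tour([1, 2, 3]).items():
        print(name, result)
```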

Again, all those different ways are both a strength and a weakness. They're a strength because the flexibility may be required in some situations. They're a weakness because of "preferably only one obvious way to do it". A project sticking exclusively (if possible) to concurrent.futures will probably be easier to maintain over the long run, due to the lack of gratuitous novelty in how its minimal API can be used.
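
By way of contrast, most concurrent.futures code needs little beyond map(), submit(), as_completed() and Future.result(). The sketch below shows the submit/as_completed pattern; collatz_steps and steps_by_number are invented names for a toy CPU-bound task, not anything from the article.

```python
from concurrent.futures import ProcessPoolExecutor, as_completed


def collatz_steps(n):
    """Toy CPU-bound task (hypothetical): steps for n to reach 1 under the Collatz rule."""
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


def steps_by_number(nums, nprocs=2):
    results = {}
    with ProcessPoolExecutor(max_workers=nprocs) as executor:
        # submit() returns one Future per task; remember its input.
        futures = {executor.submit(collatz_steps, n): n for n in nums}
        # as_completed() yields each Future as it finishes, in completion order.
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return results


if __name__ == "__main__":
    print(steps_by_number([6, 7, 27]))
```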
