Difference between multiprocessing, asyncio and concurrent.futures in Python

Question

Being new to concurrency, I am confused about when to use each of Python's concurrency libraries. As I understand it, multiprocessing, multithreading and asynchronous programming are all forms of concurrency, while multiprocessing also belongs to a subset of concurrency called parallelism.

I searched the web for different ways to approach concurrency in Python and came across the multiprocessing library, concurrent.futures' ProcessPoolExecutor() and ThreadPoolExecutor(), and asyncio. What confuses me is the difference between these libraries, especially what the multiprocessing library does: since it has methods like pool.apply_async, does it also do the job of asyncio? If so, why is it called multiprocessing when it achieves concurrency by a different method than asyncio (multiple processes vs cooperative multitasking)?

Recommended answer

There are several different libraries at play:

  • threading: an interface to OS-level threads. Note that CPU-bound work is mostly serialized by the GIL, so don't expect threads to speed up your calculations. Use it when you need to invoke blocking APIs in parallel, in particular when you need control over thread creation. Avoid creating too many threads, as they are expensive.

  • multiprocessing: an interface for spawning multiple Python processes, with an API intentionally similar to threading. Multiple processes work in parallel, so you can actually speed up calculations using this method. The disadvantage is that you can't share in-memory data structures without using multiprocessing-specific tools.

  • concurrent.futures: a more modern interface to threading and multiprocessing, which provides convenient thread/process pools that it calls executors. The pool's main entry point is the submit method, which returns a handle that you can test for completion or wait on for its result. Getting the result gives you the return value of the submitted function and correctly propagates raised exceptions (if any), which would be tedious to do with threading. concurrent.futures should be the tool of choice when considering thread-based or process-based parallelism.

  • asyncio: while the previous options are "async" in the sense that they provide non-blocking APIs (this is what apply_async and others refer to), they still rely on thread/process pools to do their magic, and cannot really do more things in parallel than they have workers in the pool. Asyncio uses a single thread of execution and async system calls across the board. It has no blocking calls at all; the only blocking part is the asyncio.run() entry point. Asyncio code is typically written using coroutines, which use await to suspend until something interesting happens. (Suspending is different from blocking in that it allows the event loop thread to move on to other things while you're waiting.) It has many advantages over thread-based solutions, such as being able to spawn thousands of cheap "tasks" without bogging down the system, and being able to cancel tasks or easily wait for multiple things at once. Asyncio should be the tool of choice for servers and for clients connecting to multiple servers.
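As an illustration of the submit/result pattern described for concurrent.futures above, here is a minimal sketch. The names parse_int and run_batch are hypothetical helpers made up for this example. It uses a thread pool; ProcessPoolExecutor offers the same submit interface, but it requires picklable functions and arguments.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_int(text):
    """Hypothetical task: int() raises ValueError on bad input."""
    return int(text)


def run_batch(inputs, max_workers=4):
    """Submit each input to a thread pool and collect (input, outcome) pairs."""
    outcomes = []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # submit() returns immediately with a Future handle for each call.
        futures = [(text, executor.submit(parse_int, text)) for text in inputs]
        for text, future in futures:
            try:
                # result() waits for completion and re-raises any exception
                # that the submitted function raised in the worker thread.
                outcomes.append((text, future.result()))
            except ValueError as exc:
                outcomes.append((text, exc))
    return outcomes


if __name__ == "__main__":
    for text, outcome in run_batch(["1", "42", "oops"]):
        print(text, "->", repr(outcome))
```

The exception raised inside the worker surfaces at the future.result() call in the submitting thread, which is the part that would take extra plumbing with bare threading.Thread objects.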

When choosing between asyncio and multithreading/multiprocessing, consider the adage that "threading is for working in parallel, and async is for waiting in parallel".
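A rough sketch of "waiting in parallel": many coroutines suspended at once on a single thread. asyncio.sleep stands in for a real network wait, and fake_request, wait_in_parallel and timed_run are hypothetical names chosen for this example.

```python
import asyncio
import time


async def fake_request(name, delay):
    # await suspends this coroutine, letting the event loop run the others
    # while the (simulated) wait is in progress.
    await asyncio.sleep(delay)
    return name


async def wait_in_parallel(count, delay):
    # Each task is cheap: no OS thread is created per task.
    tasks = [
        asyncio.create_task(fake_request(f"req-{i}", delay))
        for i in range(count)
    ]
    # gather() waits for all tasks and returns results in submission order.
    return await asyncio.gather(*tasks)


def timed_run(count=100, delay=0.1):
    start = time.perf_counter()
    names = asyncio.run(wait_in_parallel(count, delay))
    return names, time.perf_counter() - start


if __name__ == "__main__":
    names, elapsed = timed_run()
    print(f"{len(names)} waits finished in {elapsed:.2f}s")
```

Because the waits overlap, the total time should be close to a single delay rather than count times the delay. The same structure would not speed up CPU-bound work, since only one coroutine runs at any moment.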

Also note that asyncio can await functions executed in thread or process pools provided by concurrent.futures, so it can serve as glue between all those different models. This is part of the reason why asyncio is often used to build new library infrastructure.
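One way this glue role might look in practice is loop.run_in_executor, which hands a blocking function to a concurrent.futures executor and returns something a coroutine can await. In this sketch, blocking_read is a hypothetical stand-in for any blocking call, and the pool size is arbitrary.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_read(n):
    # Stand-in for a blocking API (file I/O, a synchronous client library...).
    time.sleep(0.05)
    return n * 2


async def main(values):
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=4) as pool:
        # run_in_executor() schedules the call on the pool and returns an
        # awaitable, so the event loop stays free while the threads block.
        pending = [loop.run_in_executor(pool, blocking_read, v) for v in values]
        return await asyncio.gather(*pending)


if __name__ == "__main__":
    print(asyncio.run(main([1, 2, 3])))
```

Passing a ProcessPoolExecutor instead should work the same way for CPU-bound functions, provided the function and its arguments can be pickled.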
