Why doesn't asyncio always use executors?


Question

I have to send a lot of HTTP requests, and the program can continue only once all of them have returned. That sounds like a perfect match for asyncio. A bit naively, I wrapped my calls to requests in an async function and handed them to asyncio. This doesn't work.

After searching online, I found two solutions:


  • use a library like aiohttp, which is made to work with asyncio
  • wrap the blocking code in a call to run_in_executor

To understand this better, I wrote a small benchmark. The server side is a Flask program that waits 0.1 seconds before answering a request.

from flask import Flask
import time

app = Flask(__name__)


@app.route('/')
def hello_world():
    time.sleep(0.1)  # heavy calculations here :)
    return 'Hello World!'


if __name__ == '__main__':
    app.run()

The client is my benchmark:

import requests
from time import perf_counter, sleep

# this is the baseline, sequential calls to requests.get
start = perf_counter()
for i in range(10):
    r = requests.get("http://127.0.0.1:5000/")
stop = perf_counter()
print(f"synchronous took {stop-start} seconds") # 1.062 secs

# now the naive asyncio version
import asyncio
loop = asyncio.get_event_loop()

async def get_response():
    r = requests.get("http://127.0.0.1:5000/")

start = perf_counter()
loop.run_until_complete(asyncio.gather(*[get_response() for i in range(10)]))
stop = perf_counter()
print(f"asynchronous took {stop-start} seconds") # 1.049 secs

# the fast asyncio version
start = perf_counter()
loop.run_until_complete(asyncio.gather(
    *[loop.run_in_executor(None, requests.get, 'http://127.0.0.1:5000/') for i in range(10)]))
stop = perf_counter()
print(f"asynchronous (executor) took {stop-start} seconds") # 0.122 secs

#finally, aiohttp
import aiohttp

async def get_response(session):
    async with session.get("http://127.0.0.1:5000/") as response:
        return await response.text()

async def main():
    async with aiohttp.ClientSession() as session:
        await get_response(session)

start = perf_counter()
loop.run_until_complete(asyncio.gather(*[main() for i in range(10)]))
stop = perf_counter()
print(f"aiohttp took {stop-start} seconds") # 0.121 secs

So an intuitive implementation with asyncio doesn't cope with blocking I/O code. But if you use asyncio correctly, it is just as fast as the special aiohttp framework. The docs for coroutines and tasks don't really mention this. Only when you read up on loop.run_in_executor() do you find this comment:


# File operations (such as logging) can block the
# event loop: run them in a thread pool.


I was surprised by this behaviour. The purpose of asyncio is to speed up blocking I/O calls. Why is an additional wrapper, run_in_executor, necessary to do this?

The whole selling point of aiohttp seems to be support for asyncio. But as far as I can see, the requests module works perfectly, as long as you wrap it in an executor. Is there a reason to avoid wrapping something in an executor?

Recommended answer


But as far as I can see, the requests module works perfectly - as long as you wrap it in an executor. Is there a reason to avoid wrapping something in an executor ?

Running code in an executor means running it in OS threads.

aiohttp and similar libraries let you run non-blocking code without OS threads, using coroutines only.
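To make the threads-versus-coroutines distinction concrete, here is a minimal sketch that records which thread each kind of job runs on. The function names (blocking_job, coroutine_job, collect_thread_names) are made up for illustration and do not come from the question or from aiohttp; it assumes the default executor, which asyncio backs with a thread pool.

```python
import asyncio
import threading


def blocking_job():
    # Called through run_in_executor, so it runs on a pool worker thread.
    return threading.current_thread().name


async def coroutine_job():
    # A plain coroutine runs on the thread that drives the event loop.
    await asyncio.sleep(0)
    return threading.current_thread().name


async def collect_thread_names():
    loop = asyncio.get_running_loop()
    # None selects the loop's default executor (a thread pool).
    executor_names = await asyncio.gather(
        *[loop.run_in_executor(None, blocking_job) for _ in range(3)]
    )
    coroutine_names = await asyncio.gather(
        *[coroutine_job() for _ in range(3)]
    )
    return executor_names, coroutine_names


if __name__ == "__main__":
    executor_names, coroutine_names = asyncio.run(collect_thread_names())
    print("executor jobs ran on:", sorted(set(executor_names)))
    print("coroutines ran on:", sorted(set(coroutine_names)))
```

The executor jobs should report worker thread names, while every coroutine reports the single thread running the loop.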

If you don't have much work, the difference between OS threads and coroutines is not significant, especially compared with the bottleneck, the I/O operations themselves. But once you have a lot of work, you can see that OS threads perform relatively worse because of expensive context switching.

For example, when I change your code to time.sleep(0.001) and range(100), my machine shows:

asynchronous (executor) took 0.21461606299999997 seconds
aiohttp took 0.12484742700000007 seconds

And this difference will only grow with the number of requests.
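One way to see a gap open up without a server is the sketch below. Note that it illustrates a different, easier-to-reproduce effect than the context-switching cost described above: a thread pool has a fixed number of workers, so blocking calls beyond that number wait in a queue, whereas coroutines all wait at once. DELAY, POOL_SIZE and the function names are assumptions chosen for illustration, and time.sleep stands in for a blocking request.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor
from time import perf_counter

DELAY = 0.01   # assumed duration of one blocking call
POOL_SIZE = 4  # assumed, deliberately small worker count


async def run_in_threads(n):
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
        start = perf_counter()
        # Only POOL_SIZE sleeps run at a time; the rest queue up.
        await asyncio.gather(
            *[loop.run_in_executor(pool, time.sleep, DELAY) for _ in range(n)]
        )
        return perf_counter() - start


async def run_as_coroutines(n):
    start = perf_counter()
    # All n sleeps are pending on the event loop at the same time.
    await asyncio.gather(*[asyncio.sleep(DELAY) for _ in range(n)])
    return perf_counter() - start


async def compare(n):
    return await run_in_threads(n), await run_as_coroutines(n)


if __name__ == "__main__":
    threaded, coroutines = asyncio.run(compare(32))
    print(f"thread pool took {threaded:.3f} seconds")
    print(f"coroutines took {coroutines:.3f} seconds")
```

With 32 jobs and 4 workers the thread-pool run needs roughly eight rounds of DELAY, while the coroutine run needs about one. Exact numbers depend on the machine.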


The purpose of asyncio is to speed up blocking io calls.

No, the purpose of asyncio is to provide a convenient way to control execution flow. asyncio lets you choose how that flow works: based on coroutines and OS threads (when you use an executor), or on pure coroutines (as aiohttp does).
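The three flows can be compared without a network in a small sketch like the following, where time.sleep stands in for a blocking call such as requests.get and asyncio.sleep stands in for a non-blocking one. The names and the DELAY and N values are assumptions for illustration only.

```python
import asyncio
import time
from time import perf_counter

DELAY = 0.05  # assumed duration of one slow I/O call
N = 5         # number of concurrent jobs


async def blocking_coroutine():
    # Blocks the event loop: no other coroutine can run meanwhile.
    time.sleep(DELAY)


async def cooperative_coroutine():
    # Hands control back to the event loop while waiting.
    await asyncio.sleep(DELAY)


async def timed(make_awaitables):
    start = perf_counter()
    await asyncio.gather(*make_awaitables())
    return perf_counter() - start


async def compare():
    loop = asyncio.get_running_loop()
    blocking = await timed(lambda: [blocking_coroutine() for _ in range(N)])
    cooperative = await timed(
        lambda: [cooperative_coroutine() for _ in range(N)]
    )
    # Blocking calls moved to the default thread pool run side by side.
    threaded = await timed(
        lambda: [loop.run_in_executor(None, time.sleep, DELAY) for _ in range(N)]
    )
    return blocking, cooperative, threaded


if __name__ == "__main__":
    blocking, cooperative, threaded = asyncio.run(compare())
    print(f"blocking call inside coroutines: {blocking:.3f} seconds")
    print(f"pure coroutines:                 {cooperative:.3f} seconds")
    print(f"coroutines + executor threads:   {threaded:.3f} seconds")
```

The first variant should take about N times DELAY because the blocking calls run one after another, while the other two finish in roughly one DELAY.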

It's aiohttp's purpose to speed things up, and it copes with the task, as shown above :)
