Why doesn't asyncio always use executors?


Question


I have to send a lot of HTTP requests, once all of them have returned, the program can continue. Sounds like a perfect match for asyncio. A bit naively, I wrapped my calls to requests in an async function and gave them to asyncio. This doesn't work.

After searching online, I found two solutions:

  • use a library like aiohttp, which is made to work with asyncio
  • wrap the blocking code in a call to run_in_executor

To understand this better, I wrote a small benchmark. The server-side is a flask program that waits 0.1 seconds before answering a request.

from flask import Flask
import time

app = Flask(__name__)


@app.route('/')
def hello_world():
    time.sleep(0.1)  # heavy calculations here :)
    return 'Hello World!'


if __name__ == '__main__':
    app.run()

The client is my benchmark

import requests
from time import perf_counter, sleep

# this is the baseline, sequential calls to requests.get
start = perf_counter()
for i in range(10):
    r = requests.get("http://127.0.0.1:5000/")
stop = perf_counter()
print(f"synchronous took {stop-start} seconds") # 1.062 secs

# now the naive asyncio version
import asyncio
loop = asyncio.get_event_loop()

async def get_response():
    r = requests.get("http://127.0.0.1:5000/")

start = perf_counter()
loop.run_until_complete(asyncio.gather(*[get_response() for i in range(10)]))
stop = perf_counter()
print(f"asynchronous took {stop-start} seconds") # 1.049 secs

# the fast asyncio version
start = perf_counter()
loop.run_until_complete(asyncio.gather(
    *[loop.run_in_executor(None, requests.get, 'http://127.0.0.1:5000/') for i in range(10)]))
stop = perf_counter()
print(f"asynchronous (executor) took {stop-start} seconds") # 0.122 secs

#finally, aiohttp
import aiohttp

async def get_response(session):
    async with session.get("http://127.0.0.1:5000/") as response:
        return await response.text()

async def main():
    async with aiohttp.ClientSession() as session:
        await get_response(session)

start = perf_counter()
loop.run_until_complete(asyncio.gather(*[main() for i in range(10)]))
stop = perf_counter()
print(f"aiohttp took {stop-start} seconds") # 0.121 secs

So, a naive implementation with asyncio doesn't deal with blocking I/O code. But if you use asyncio correctly, it is just as fast as the dedicated aiohttp framework. The docs for coroutines and tasks don't really mention this. Only if you read up on loop.run_in_executor() do you find:

# File operations (such as logging) can block the
# event loop: run them in a thread pool.

I was surprised by this behaviour. The purpose of asyncio is to speed up blocking I/O calls. Why is an additional wrapper, run_in_executor, necessary to do this?
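(For the record, the reason the naive version fails is that asyncio never preempts: other tasks only get a chance to run at an await point that actually suspends, and a blocking call such as requests.get never yields. A minimal sketch of that effect, using time.sleep as a stand-in for the blocking call:)

```python
import asyncio
import time

async def blocking_task():
    time.sleep(0.1)  # blocks the whole event loop; never yields

async def cooperative_task():
    await asyncio.sleep(0.1)  # suspends, so other tasks run meanwhile

async def timed(coro_factory, n=5):
    # run n copies concurrently and measure the total wall time
    start = time.perf_counter()
    await asyncio.gather(*[coro_factory() for _ in range(n)])
    return time.perf_counter() - start

blocking = asyncio.run(timed(blocking_task))        # ~0.5 s: strictly sequential
cooperative = asyncio.run(timed(cooperative_task))  # ~0.1 s: all five overlap
print(f"blocking: {blocking:.2f}s, cooperative: {cooperative:.2f}s")
```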

The whole selling point of aiohttp seems to be its support for asyncio. But as far as I can see, the requests module works perfectly - as long as you wrap it in an executor. Is there a reason to avoid wrapping something in an executor?

Solution

But as far as I can see, the requests module works perfectly - as long as you wrap it in an executor. Is there a reason to avoid wrapping something in an executor?

Running code in an executor means running it in OS threads.
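You can observe this directly: the callable handed to run_in_executor executes on a worker thread of the loop's default ThreadPoolExecutor, not on the thread driving the event loop. A small sketch:

```python
import asyncio
import threading

def where_am_i():
    # executed on a worker thread of the loop's default ThreadPoolExecutor
    return threading.current_thread().name

async def main():
    loop = asyncio.get_running_loop()
    loop_thread = threading.current_thread().name  # the event loop's thread
    worker_thread = await loop.run_in_executor(None, where_am_i)
    print(f"event loop on {loop_thread!r}, blocking call on {worker_thread!r}")
    return loop_thread, worker_thread

loop_name, worker_name = asyncio.run(main())
```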

aiohttp and similar libraries let you run non-blocking code without OS threads, using only coroutines.

If you don't have much work, the difference between OS threads and coroutines is not significant, especially compared to the bottleneck - the I/O operations themselves. But once you have a lot of work, you will notice that OS threads perform relatively worse because of expensive context switching.

For example, when I change your code to time.sleep(0.001) and range(100), my machine shows:

asynchronous (executor) took 0.21461606299999997 seconds
aiohttp took 0.12484742700000007 seconds

And this difference only grows with the number of requests.
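Another factor in the executor numbers (an observation beyond the answer above) is the default executor's thread cap: run_in_executor(None, ...) uses a ThreadPoolExecutor whose default size is min(32, os.cpu_count() + 4), so 100 blocking calls cannot all overlap, while 100 coroutines can. A rough sketch of that queueing effect, with time.sleep standing in for a real request:

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

def fake_request():
    time.sleep(0.01)  # stand-in for a blocking HTTP call

async def run_all(executor, n=100):
    # dispatch n blocking calls to the given pool and time the whole batch
    loop = asyncio.get_running_loop()
    start = time.perf_counter()
    await asyncio.gather(
        *[loop.run_in_executor(executor, fake_request) for _ in range(n)]
    )
    return time.perf_counter() - start

small = asyncio.run(run_all(ThreadPoolExecutor(max_workers=8)))    # ~13 batches of 0.01 s
large = asyncio.run(run_all(ThreadPoolExecutor(max_workers=100)))  # nearly all overlap
print(f"8 workers: {small:.3f}s, 100 workers: {large:.3f}s")
```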

The purpose of asyncio is to speed up blocking I/O calls.

Nope, the purpose of asyncio is to provide a convenient way to control execution flow. asyncio lets you choose how that flow works - based on coroutines plus OS threads (when you use an executor) or on pure coroutines (as aiohttp does).
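As an aside, newer Python (3.9+) spells the executor route as asyncio.to_thread, which wraps run_in_executor(None, ...) for you. A sketch of the benchmark's executor version in that style, with a hypothetical blocking_get standing in for requests.get so the snippet needs no server:

```python
import asyncio
import time

def blocking_get(url):
    # hypothetical stand-in for requests.get(url); any blocking callable works
    time.sleep(0.05)
    return f"response from {url}"

async def main():
    # asyncio.to_thread (Python 3.9+) runs the callable in the default executor
    return await asyncio.gather(
        *[asyncio.to_thread(blocking_get, "http://127.0.0.1:5000/") for _ in range(10)]
    )

results = asyncio.run(main())
print(len(results), "overlapping calls completed")
```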

It's aiohttp's purpose to speed things up, and it copes with that task as shown above :)
