Using Python 3.7+ to make 100k API calls, making 100 in parallel using asyncio
Problem description

What is the best approach to make, say, 100k API calls using asyncio async/await with Python 3.7+? The idea is to keep 100 tasks running in parallel at all times.
What should be avoided is:

1. Starting work on all 100k tasks at once.
2. Waiting for all 100 parallel tasks to finish before a new batch of 100 is scheduled.
This example illustrates the first approach, which is not what is needed.
import aiohttp
import asyncio

async def fetch(session, url):
    async with session.get(url) as response:
        return await response.text()

async def main():
    urls = [
        'http://python.org',
        'https://google.com',
        'http://yifei.me'
    ]
    tasks = []
    async with aiohttp.ClientSession() as session:
        for url in urls:
            tasks.append(fetch(session, url))
        htmls = await asyncio.gather(*tasks)
        for html in htmls:
            print(html[:100])

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
Recommended answer

Use a semaphore. Semaphores are used to limit concurrent actions, and Python's asyncio comes with its own async version of a semaphore.
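Before applying this to aiohttp, the limiting effect can be seen in a network-free sketch (the names `worker` and `state` here are illustrative, not from the original answer): twenty tasks contend for a semaphore of size 3, and at no point are more than three inside the guarded section at once.

```python
import asyncio

async def worker(sema, state):
    async with sema:
        # track how many workers hold the semaphore at the same time
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
        await asyncio.sleep(0.01)  # stand-in for the real API call
        state["active"] -= 1

async def main():
    sema = asyncio.BoundedSemaphore(3)  # cap concurrency at 3 for the demo
    state = {"active": 0, "peak": 0}
    await asyncio.gather(*(worker(sema, state) for _ in range(20)))
    return state["peak"]

peak = asyncio.run(main())
print(peak)  # 3 — concurrency never exceeded the semaphore's value
```

The same mechanism, with `BoundedSemaphore(value=100)`, is what keeps 100 requests in flight in the answer below.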
import aiohttp
import asyncio

async def fetch(session, url, sema):
    # acquiring the semaphore before the request caps how many run at once
    async with sema, session.get(url) as response:
        return await response.text()

async def main():
    urls = [
        'http://python.org',
        'https://google.com',
        'http://yifei.me',
        'other urls...'
    ]
    tasks = []
    sema = asyncio.BoundedSemaphore(value=100)  # at most 100 requests in flight
    async with aiohttp.ClientSession() as session:
        for url in urls:
            tasks.append(fetch(session, url, sema))
        htmls = await asyncio.gather(*tasks)
        for html in htmls:
            print(html[:100])

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
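Note that the semaphore approach still creates all 100k coroutine objects up front; only the requests themselves are throttled. If even that is undesirable, a common alternative pattern (not part of the original answer) is a fixed pool of workers consuming from an `asyncio.Queue`, so only N coroutines exist at any time. A network-free sketch with 5 workers and 50 items, where `asyncio.sleep` stands in for the HTTP request:

```python
import asyncio

async def worker(queue, results):
    # each worker loops forever, pulling items until cancelled
    while True:
        item = await queue.get()
        try:
            await asyncio.sleep(0.001)  # stand-in for the real API call
            results.append(item)
        finally:
            queue.task_done()

async def main(n_workers=5, n_items=50):
    queue = asyncio.Queue()
    results = []
    for i in range(n_items):
        queue.put_nowait(i)
    workers = [asyncio.create_task(worker(queue, results))
               for _ in range(n_workers)]
    await queue.join()  # block until every item has been processed
    for w in workers:
        w.cancel()      # workers loop forever; stop them once the queue drains
    await asyncio.gather(*workers, return_exceptions=True)
    return results

results = asyncio.run(main())
print(len(results))  # 50
```

With real URLs, each worker would open its request inside the loop body; the worker count plays the same role as the semaphore's value.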