aiohttp: rate limiting parallel requests


Problem description

APIs often have rate limits that users have to follow. As an example, let's take 50 requests/second. Sequential requests take 0.5-1 second and are thus too slow to come close to that limit. Parallel requests with aiohttp, however, exceed the rate limit.

To poll the API as fast as allowed, one needs to rate-limit the parallel calls.

The examples I have found so far decorate session.get, approximately like so:

session.get = rate_limited(max_calls_per_second)(session.get)
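
The rate_limited decorator itself is not shown here; as a minimal sketch (the implementation below is an assumption for illustration, not code from the question), it could space successive calls at least 1/max_calls_per_second apart:

import asyncio
import time
from functools import wraps

def rate_limited(max_calls_per_second):
    # Hypothetical sketch: delay each wrapped call so that successive
    # calls start at least 1/max_calls_per_second apart.
    min_interval = 1.0 / max_calls_per_second
    last_call = 0.0

    def decorator(func):
        @wraps(func)
        async def wrapper(*args, **kwargs):
            nonlocal last_call
            elapsed = time.monotonic() - last_call
            if elapsed < min_interval:
                await asyncio.sleep(min_interval - elapsed)
            last_call = time.monotonic()
            return await func(*args, **kwargs)
        return wrapper
    return decorator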

This works well for sequential calls. Trying to implement this in parallel calls does not work as intended.

Here is some code as an example:

async with aiohttp.ClientSession() as session:
    session.get = rate_limited(max_calls_per_second)(session.get)
    tasks = (asyncio.ensure_future(download_coroutine(
        timeout, session, url)) for url in urls)
    process_responses_function(await asyncio.gather(*tasks))

The problem with this is that it will rate-limit the queueing of the tasks. The execution with gather will still happen more or less at the same time. Worst of both worlds ;-).

Yes, I found a similar question right here, aiohttp: set maximum number of requests per second, but neither of the replies answers the actual question of limiting the rate of requests. Also, the blog post from Quentin Pradet works only on rate-limiting the queueing.

To wrap it up: how can one limit the number of requests per second for parallel aiohttp requests?

Recommended answer

If I understand you correctly, you want to limit the number of simultaneous requests?

There is an object inside asyncio named Semaphore; it works like an asynchronous counting lock.

semaphore = asyncio.Semaphore(50)
# ...
async def limit_wrap(url):
    async with semaphore:  # at most 50 coroutines pass this point at once
        # do what you want
        ...
# ...
# gather takes the awaitables unpacked and must itself be awaited
results = await asyncio.gather(*(limit_wrap(url) for url in urls))
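
Put together with aiohttp, a runnable version of this idea might look as follows; the URL list and the response handling are placeholders of mine, not part of the answer:

import asyncio
import aiohttp

URLS = ["https://example.com/"] * 10  # placeholder URLs

async def main():
    semaphore = asyncio.Semaphore(50)  # at most 50 requests in flight

    async def fetch(session, url):
        async with semaphore:  # wait for a free slot
            async with session.get(url) as response:
                return await response.text()

    async with aiohttp.ClientSession() as session:
        return await asyncio.gather(*(fetch(session, url) for url in URLS))

results = asyncio.get_event_loop().run_until_complete(main())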

Updated

Suppose I make 50 concurrent requests, and they all finish in 2 seconds. Then the limit is never touched (only 25 requests per second).

That means I should make 100 concurrent requests that all finish in 2 seconds as well (50 requests per second). But before you actually make those requests, how can you know how long they will take to finish?

Or, if what you care about is not requests finished per second but requests made per second, you can:

async def loop_wrap(urls):
    for url in urls:
        asyncio.ensure_future(download(url))
        await asyncio.sleep(1 / 50)  # start at most 50 requests per second

loop = asyncio.get_event_loop()
asyncio.ensure_future(loop_wrap(urls))
loop.run_forever()

The code above will create a Future instance every 1/50th of a second.
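
Note that the two techniques can also be combined: pacing how fast requests start while a semaphore caps how many are in flight. The following is my own sketch of that combination, not something stated in the answer:

import asyncio
import aiohttp

async def fetch(session, semaphore, url):
    async with semaphore:  # cap the number of in-flight requests
        async with session.get(url) as response:
            return await response.text()

async def main(urls):
    semaphore = asyncio.Semaphore(50)  # at most 50 in flight
    async with aiohttp.ClientSession() as session:
        tasks = []
        for url in urls:
            tasks.append(asyncio.ensure_future(fetch(session, semaphore, url)))
            await asyncio.sleep(1 / 50)  # start at most 50 per second
        return await asyncio.gather(*tasks)

urls = ["https://example.com/"] * 100  # placeholder URLs
loop = asyncio.get_event_loop()
results = loop.run_until_complete(main(urls))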
