aiohttp: rate limiting parallel requests


Problem description

APIs often have rate limits that users have to follow. As an example, let's take 50 requests/second. Sequential requests take 0.5-1 second and thus are too slow to come close to that limit. Parallel requests with aiohttp, however, exceed the rate limit.

To poll the API as fast as allowed, one needs to rate limit parallel calls.

Examples that I found so far decorate session.get, approximately like so:

session.get = rate_limited(max_calls_per_second)(session.get)

This works well for sequential calls. Trying to implement this in parallel calls does not work as intended.

Here is some code as an example:

async with aiohttp.ClientSession() as session:
    session.get = rate_limited(max_calls_per_second)(session.get)
    tasks = (asyncio.ensure_future(download_coroutine(
        timeout, session, url)) for url in urls)
    process_responses_function(await asyncio.gather(*tasks))

The problem with this is that it will rate-limit the queueing of the tasks. The execution with gather will still happen more or less at the same time. Worst of both worlds ;-).

Yes, I found a similar question right here, aiohttp: set maximum number of requests per second, but neither reply answers the actual question of limiting the rate of requests. Also, the blog post from Quentin Pradet works only on rate-limiting the queueing.

To wrap it up: How can one limit the number of requests per second for parallel aiohttp requests?

Recommended answer

If I understand you correctly, you want to limit the number of simultaneous requests?

asyncio has an object named Semaphore that works like an asynchronous lock, except that a configurable number of coroutines can hold it at the same time.

semaphore = asyncio.Semaphore(50)
# ...
async def limit_wrap(url):
    async with semaphore:
        # at most 50 coroutines can be inside this block at once
        ...  # do what you want
# ...
# gather() takes coroutines as positional arguments, so unpack the list,
# and it must be awaited from inside a coroutine
results = await asyncio.gather(*[limit_wrap(url) for url in urls])
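
For context, here is a minimal runnable sketch of the same idea wired into a real aiohttp session; the names fetch, main, and MAX_CONCURRENT are placeholders introduced for illustration, not part of the original answer:

import asyncio
import aiohttp

MAX_CONCURRENT = 50  # assumed cap on simultaneous requests

async def fetch(session, semaphore, url):
    # the semaphore ensures at most MAX_CONCURRENT requests are in flight
    async with semaphore:
        async with session.get(url) as response:
            return await response.text()

async def main(urls):
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)
    async with aiohttp.ClientSession() as session:
        return await asyncio.gather(*(fetch(session, semaphore, url)
                                      for url in urls))

# results = asyncio.get_event_loop().run_until_complete(main(urls))

Note that this bounds concurrency, not requests per second, which is exactly the distinction the update below draws.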

Update

Suppose I make 50 concurrent requests, and they all finish in 2 seconds. Then it never touches the limit (that is only 25 requests per second).

That means I should make 100 concurrent requests so that they, too, all finish in 2 seconds (50 requests per second). But before you actually make those requests, how could you determine how long they will take to finish?

Or, if what you care about is not finished requests per second but requests made per second, you can:

async def loop_wrap(urls):
    for url in urls:
        # schedule the request without waiting for it to finish;
        # download() is the caller's own coroutine
        asyncio.ensure_future(download(url))
        await asyncio.sleep(1 / 50)  # start at most 50 requests per second

loop = asyncio.get_event_loop()
asyncio.ensure_future(loop_wrap(urls))
loop.run_forever()

The code above will create a Future instance every 1/50 second.
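
If the API demands both a requests-per-second cap and a bound on in-flight requests, the two techniques compose. Below is a minimal sketch, assuming a hypothetical download coroutine like the one above, that paces request starts with sleep while also holding a Semaphore:

import asyncio
import aiohttp

REQUESTS_PER_SECOND = 50  # assumed API limit
MAX_CONCURRENT = 50       # assumed cap on in-flight requests

async def download(session, semaphore, url):
    # hypothetical worker; the semaphore bounds concurrency
    async with semaphore:
        async with session.get(url) as response:
            return await response.text()

async def main(urls):
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)
    async with aiohttp.ClientSession() as session:
        tasks = []
        for url in urls:
            tasks.append(asyncio.ensure_future(
                download(session, semaphore, url)))
            # pace task creation: at most 50 new requests per second
            await asyncio.sleep(1 / REQUESTS_PER_SECOND)
        return await asyncio.gather(*tasks)

# results = asyncio.get_event_loop().run_until_complete(main(urls))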

