SQS batching for Lambda trigger doesn't work as expected


Problem description

I have 2 Lambda functions with an SQS queue in between. The first Lambda sends messages to the queue. The second Lambda has a trigger on this queue with a batch size of 250 and a batch window of 65 seconds.

I expect the second Lambda to be triggered with a batch of 250 messages roughly every 65 seconds. In the second Lambda I'm calling a 3rd party API that is limited to 250 API calls per minute (I get 250 tokens per minute).

I tested this setup by adding 32,000 messages to the queue, and the second Lambda didn't pick up the messages in batches as expected. At first it was executed for 15k messages, and then there were not enough tokens left, so it did not process the remaining messages.

The 3rd party API is based on a token bucket with a fill rate of 250 tokens per minute and a maximum capacity of 15,000. It managed to process the first 15,000 messages thanks to the bucket capacity and then didn't have enough capacity to handle the rest.
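To make the numbers concrete, here is a minimal sketch of such a token bucket. It assumes the bucket was full when the test started and that each message costs exactly one token; the class and function names are made up for illustration and are not part of the 3rd party API.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Token bucket with a fixed capacity and a per-minute fill rate."""
    capacity: int
    fill_per_minute: int
    tokens: int

    def take(self, wanted: int) -> int:
        """Consume up to `wanted` tokens and return how many were granted."""
        granted = min(wanted, self.tokens)
        self.tokens -= granted
        return granted

    def refill_one_minute(self) -> None:
        """Add one minute's worth of tokens, capped at the capacity."""
        self.tokens = min(self.capacity, self.tokens + self.fill_per_minute)


def minutes_to_drain(messages: int, bucket: TokenBucket) -> int:
    """Minutes of refilling needed to process `messages`, assuming one token
    per message and a consumer that uses every token as soon as it exists."""
    minutes = 0
    remaining = messages - bucket.take(messages)
    while remaining > 0:
        bucket.refill_one_minute()
        remaining -= bucket.take(remaining)
        minutes += 1
    return minutes


# Assumption: the bucket is full when the 32,000 messages arrive.
bucket = TokenBucket(capacity=15_000, fill_per_minute=250, tokens=15_000)
burst = bucket.take(32_000)   # what an unthrottled consumer can use immediately
backlog = 32_000 - burst      # messages left without a token
print(burst, backlog)
```

Under these assumptions an unthrottled consumer burns the whole bucket at once, and the leftover messages can then only be processed at the fill rate of 250 per minute.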

I don't understand what went wrong.

Recommended answer

The misunderstanding is probably related to how Lambda handles scaling. Whenever there are more events than a single Lambda execution context/instance can handle, Lambda just creates more execution contexts/instances to process these events.

What probably happened is that Lambda saw a bunch of messages in the queue and tried to work through them as fast as possible. The batch window is only an upper bound on how long Lambda waits to fill a batch, so with thousands of messages waiting, a batch of 250 fills right away. Lambda created an instance to handle the first batch and then asked SQS for more work. When it got the next batch of messages, the first instance was still busy, so it scaled out and created a second instance that worked on the second batch in parallel, and so on.

That's how you ended up going through your token budget in a few minutes.

You can limit how many instances of a function Lambda is allowed to run in parallel by using reserved concurrency (see the AWS Lambda docs on reserved concurrency for reference). If you set the reserved concurrency to 1, there will be no parallelization and only one Lambda instance is allowed to work on the messages.
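As a rough sketch, reserved concurrency can be set through the Lambda API's `put_function_concurrency` call, for example via boto3. The helper name and the function name `second-lambda` below are placeholders, not something from the question.

```python
def limit_to_single_instance(lambda_client, function_name: str) -> int:
    """Cap the function at one concurrent execution.

    `lambda_client` is expected to be a boto3 Lambda client, i.e. the
    result of boto3.client("lambda"). Returns the value that was applied.
    """
    response = lambda_client.put_function_concurrency(
        FunctionName=function_name,
        ReservedConcurrentExecutions=1,  # no parallel executions
    )
    return response["ReservedConcurrentExecutions"]


# Usage (needs boto3 and AWS credentials; "second-lambda" is a placeholder):
#   import boto3
#   limit_to_single_instance(boto3.client("lambda"), "second-lambda")
```

The same setting is available in the Lambda console and in infrastructure-as-code tools; the call above is just one way to apply it.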

This, however, opens you up to another issue. If that single Lambda takes less than 60 seconds to process a batch, Lambda will invoke it again with the next batch as soon as possible, and you might go over your budget again.

At this point a relatively simple approach would be to make sure that your Lambda function always takes about 60 seconds, by adding a sleep for the remaining time at the end.
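A minimal sketch of that padding idea follows. `call_third_party_api` is a hypothetical placeholder for the rate-limited call, and the `clock` and `sleep` parameters exist only so the timing can be tested; Lambda itself passes just `event` and `context`. The function timeout has to be configured above 60 seconds for this to work.

```python
import time

MIN_DURATION_SECONDS = 60.0  # one batch of 250 calls per minute


def call_third_party_api(body: str) -> None:
    """Hypothetical placeholder for the rate-limited 3rd party API call."""


def handler(event, context, clock=time.monotonic, sleep=time.sleep):
    """Process one SQS batch, then sleep until about 60 seconds have passed."""
    started = clock()
    records = event.get("Records", [])
    for record in records:
        call_third_party_api(record["body"])

    # Pad the invocation so the next batch cannot start early.
    remaining = MIN_DURATION_SECONDS - (clock() - started)
    if remaining > 0:
        sleep(remaining)
    return {"processed": len(records)}
```

Combined with a reserved concurrency of 1, this keeps the function at roughly one batch per minute, at the cost of paying for the idle sleep time.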
