How to process SQS queue with lambda function (not via scheduled events)?

Question

Here is the simplified scheme I am trying to make work:

http requests --> (Gateway API + lambda A) --> SQS --> (lambda B ?????) --> DynamoDB

So it should work as shown: data coming from many http requests (up to 500 per second, for example) is placed into an SQS queue by my lambda function A. Then the other function, B, processes the queue: it reads up to 10 items (on some periodic basis) and writes them to DynamoDB with BatchWriteItem.
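A minimal sketch of lambda A's side of this scheme (the queue URL, field names, and handler wiring below are assumptions for illustration, not from the original post): each request body is wrapped into SQS SendMessage parameters so the HTTP client can get a fast 200 while the DynamoDB write is deferred to function B.

```javascript
// Hypothetical helper: build the SendMessage parameters for one request body.
function buildSendMessageParams(queueUrl, requestBody) {
  return {
    QueueUrl: queueUrl,
    MessageBody: JSON.stringify(requestBody)
  };
}

// The actual handler would look roughly like this (requires aws-sdk):
// exports.handler = async (event) => {
//   await sqs.sendMessage(
//     buildSendMessageParams(QUEUE_URL, JSON.parse(event.body))
//   ).promise();
//   return { statusCode: 200, body: 'OK' };
// };
```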

The problem is that I can't figure out how to trigger the second lambda function. It should be called frequently, multiple times per second (or at least once per second), because I need all the data from the queue to get into DynamoDB ASAP (that's why calling lambda function B via scheduled events as described here is not an option).

Why don't I want to write directly to DynamoDB, without SQS?

It would be great to avoid using SQS at all. The problem I am trying to address with SQS is DynamoDB throttling. Not even the throttling itself, but the way it is handled while writing data to DynamoDB with the AWS SDK: when records are written one by one and get throttled, the AWS SDK silently retries the writes, increasing the request processing time from the http client's point of view.

So I would like to temporarily store data in the queue, send a "200 OK" response back to the client, and then have the queue processed by a separate function that writes multiple records with a single DynamoDB BatchWriteItem call (which returns unprocessed items instead of automatically retrying when throttled). I would even prefer to lose some records rather than increase the lag between a record being received and stored in DynamoDB.

UPD: If anyone is interested, I have found out how to make aws-sdk skip automatic retries in case of throttling: there is a special parameter, maxRetries. Anyway, I am going to use Kinesis as suggested below.
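For the update above, a minimal sketch of that client configuration in the AWS SDK for JavaScript (v2): setting maxRetries to 0 on the service client makes throttled writes fail fast instead of being silently retried (the region value is a placeholder assumption).

```javascript
// Client configuration that disables the SDK's automatic retries,
// so throttling surfaces immediately instead of inflating latency.
const dynamoConfig = {
  region: 'us-east-1', // placeholder region
  maxRetries: 0        // 0 = no automatic retry on throttled requests
};
// const dynamodb = new AWS.DynamoDB(dynamoConfig); // requires aws-sdk
```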

Answer

[This doesn't directly answer your explicit question, so in my experience it will be downvoted :) However, I will answer the fundamental problem you are trying to solve.]

The way we take a flood of incoming requests and feed them to AWS Lambda functions for writing in a paced manner to DynamoDB is to replace SQS in the proposed architecture with Amazon Kinesis streams.

Kinesis streams can drive AWS Lambda functions.

Kinesis streams guarantee ordering of the delivered messages for any given key (nice for ordered database operations).

Kinesis streams let you specify how many AWS Lambda functions can be run in parallel (one per partition), which can be coordinated with your DynamoDB write capacity.

Kinesis streams can pass multiple available messages in one AWS Lambda function invocation, allowing for further optimization.

Note: It's really the AWS Lambda service that reads from Amazon Kinesis streams and then invokes the function, not Kinesis streams directly invoking AWS Lambda; but sometimes it's easier to visualize it as Kinesis driving it. The result to the user is nearly the same.
