Kinesis consumer getRecords does not return 10,000 records consistently

Question

I have a Kinesis stream (20 shards) with about one day of data lag, consumed by a KCL-based Kinesis consumer. The consumer is deployed as 20 ECS instances, so each instance has a thread pulling data from its shard.

Based on the documentation, a single getRecords call can fetch up to 10,000 records or a maximum payload of 10 MB. However, when I monitor the consumer logs, not all shards seem to reach this limit. The number of records fetched by a single getRecords call is very inconsistent across the consumer instances. Some calls fetch around 100-400 records, while others fetch around 4,000-5,000. On rare occasions, a call fetches 9,999 records. As a result, the data lag is not shrinking.

The consumer takes around 5 minutes to process 10,000 records, so the read throughput limit is not being reached either.
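
As a rough sanity check on these figures, the processing rate they imply can be worked out directly. The sketch below uses only the numbers stated above (10,000 records, 5 minutes, 20 shards); the function name is hypothetical, and the calculation assumes processing is strictly sequential per shard.

```python
def records_per_second(batch_size: int, processing_seconds: float) -> float:
    """Average processing rate for one batch, assuming sequential processing."""
    return batch_size / processing_seconds

# Figures from the question: 10,000 records take about 5 minutes on one shard.
per_shard = records_per_second(10_000, 5 * 60)

# 20 shards, one consumer thread each.
total = per_shard * 20

print(f"{per_shard:.1f} records/s per shard, {total:.0f} records/s total")
```

If producers write to the stream faster than the total rate printed here, the lag cannot shrink no matter how many records each getRecords call returns.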

Is there an explanation for this, or are there metrics I could look into to debug this issue further?

Recommended answer

Depending on your record sizes, this might be because of the following Kinesis service limit:

Each shard can support up to a maximum total data read rate of 2 MB per second via GetRecords. If a call to GetRecords returns 10 MB, subsequent calls made within the next 5 seconds throw an exception.
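
To see how record size interacts with these limits, here is a minimal sketch of the arithmetic. The limits are the ones quoted in the question and above; the average record sizes are hypothetical, the helper names are made up for illustration, and 1 MB is assumed to mean 2**20 bytes.

```python
# Limits quoted in the question and the answer; 1 MB treated as 2**20 bytes (assumption).
MB = 1024 * 1024
MAX_RECORDS_PER_CALL = 10_000
MAX_BYTES_PER_CALL = 10 * MB
READ_BYTES_PER_SECOND = 2 * MB

def max_records_per_call(avg_record_bytes: int) -> int:
    """Records that fit in one GetRecords response, whichever cap is hit first."""
    return min(MAX_RECORDS_PER_CALL, MAX_BYTES_PER_CALL // avg_record_bytes)

def seconds_until_quota_recovers(response_bytes: int) -> float:
    """Time the 2 MB/s per-shard quota needs to absorb a response of this size."""
    return response_bytes / READ_BYTES_PER_SECOND

for size_kib in (1, 2, 25, 100):  # hypothetical average record sizes
    size = size_kib * 1024
    print(size_kib, "KiB ->", max_records_per_call(size), "records per call")

print(seconds_until_quota_recovers(MAX_BYTES_PER_CALL), "seconds after a full 10 MB response")
```

Under these assumptions, records averaging about 2 KiB would cap a call near 5,000 records, and 25 KiB records near 400, regardless of the 10,000-record limit.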

You might want to consider adding more shards if this is indeed the limit you are running into.
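
One way to check whether reads are actually being throttled is the stream-level CloudWatch metric ReadProvisionedThroughputExceeded in the AWS/Kinesis namespace. The sketch below only builds the query parameters; the function name and the stream name "my-stream" are hypothetical, and the commented-out boto3 call assumes credentials and a region are already configured.

```python
from datetime import datetime, timedelta, timezone

def throttle_metric_query(stream_name, hours=1, now=None):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Kinesis",
        "MetricName": "ReadProvisionedThroughputExceeded",
        "Dimensions": [{"Name": "StreamName", "Value": stream_name}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 60,           # one datapoint per minute
        "Statistics": ["Sum"],  # throttled GetRecords calls per period
    }

query = throttle_metric_query("my-stream")  # hypothetical stream name
print(query["MetricName"], "from", query["StartTime"], "to", query["EndTime"])

# Usage with boto3 (not run here):
# import boto3
# cloudwatch = boto3.client("cloudwatch")
# response = cloudwatch.get_metric_statistics(**query)
# throttled = sum(point["Sum"] for point in response["Datapoints"])
```

A sum that stays at zero would suggest the 2 MB per second limit is not the bottleneck, in which case adding shards is less likely to help than speeding up record processing.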
