Kafka - Message Ordering Guarantees
I came across two statements about ordering in the documentation:
- Messages sent by a producer to a particular topic partition will be appended in the order they are sent. That is, if a record M1 is sent by the same producer as a record M2, and M1 is sent first, then M1 will have a lower offset than M2 and appear earlier in the log.
Another
- (config param) max.in.flight.requests.per.connection - The maximum number of unacknowledged requests the client will send on a single connection before blocking. Note that if this setting is set to be greater than 1 and there are failed sends, there is a risk of message re-ordering due to retries (i.e., if retries are enabled).
My question: is the order within a particular partition still retained if there are failed sends, as mentioned in #2? If one message runs into a problem, will all following messages for that partition be dropped "to retain the order", or will the "correct" messages be sent and the application be notified of the failed ones?
"will the order still be retained to a particular partition if there are failed sends like mentioned #2?"
As the documentation excerpt you quoted says, there is a risk that the ordering changes.
Imagine you have a topic with, say, one partition. You set retries to 100 and max.in.flight.requests.per.connection to 5, which is greater than 1. Note that retries only make sense if you set acks to 1 or "all".
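As a rough illustration, the settings described above could be written like this. It is only a sketch: it uses plain java.util.Properties with the property names as strings, so it runs without the Kafka client on the classpath. The class name RiskyOrderingConfig and the broker address localhost:9092 are placeholders I made up for the example.

```java
import java.util.Properties;

// Hypothetical helper (not from the Kafka API): collects the settings from the example.
final class RiskyOrderingConfig {

    static Properties build() {
        Properties props = new Properties();
        // Placeholder broker address, assumed for illustration only.
        props.setProperty("bootstrap.servers", "localhost:9092");
        // Retries only make sense when the producer waits for an acknowledgement.
        props.setProperty("acks", "all");
        props.setProperty("retries", "100");
        // More than one in-flight request combined with retries: reordering is possible.
        props.setProperty("max.in.flight.requests.per.connection", "5");
        return props;
    }

    public static void main(String[] args) {
        build().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

In a real application these properties, together with key and value serializer settings (omitted here), would be passed to the KafkaProducer constructor.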
If you plan to produce the following messages in the order K1, K2, K3, K4, K5 and it takes your producer some time to
- actually create the batch and
- make a request to the broker and
- wait for the acknowledgement of the broker
you could have up to 5 requests in flight in parallel (based on the setting of max.in.flight.requests.per.connection). Now suppose producing K3 runs into a problem and goes into the retry loop. K4 and K5 can still be written, because their requests were already in flight.
Your topic would end up with the messages in this order: K1, K2, K4, K5, K3.
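The scenario can be replayed with a small toy model. To be clear, this is not how the Kafka client is implemented. It is a simplified simulation with my own assumptions: every message travels in its own request, the broker handles in-flight requests in the order they arrive, and a failed request is retried by putting it at the back of the in-flight queue. The class ReorderingSimulation and its method simulate are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of a single partition; not the real Kafka producer logic.
final class ReorderingSimulation {

    // Returns the order in which messages end up in the simulated partition log.
    static List<String> simulate(List<String> sendOrder,
                                 Set<String> failOnFirstAttempt,
                                 int maxInFlight) {
        if (maxInFlight < 1) {
            throw new IllegalArgumentException("maxInFlight must be at least 1");
        }
        Deque<String> pending = new ArrayDeque<>(sendOrder);
        Deque<String> inFlight = new ArrayDeque<>();
        Set<String> alreadyFailed = new HashSet<>();
        List<String> log = new ArrayList<>();

        while (!pending.isEmpty() || !inFlight.isEmpty()) {
            // Send new requests until the in-flight limit is reached.
            while (inFlight.size() < maxInFlight && !pending.isEmpty()) {
                inFlight.addLast(pending.pollFirst());
            }
            String message = inFlight.pollFirst();
            // A message in failOnFirstAttempt fails exactly once, then succeeds.
            boolean failsNow = failOnFirstAttempt.contains(message)
                    && alreadyFailed.add(message);
            if (failsNow) {
                inFlight.addLast(message); // retry behind the requests already in flight
            } else {
                log.add(message);          // appended to the partition log
            }
        }
        return log;
    }

    public static void main(String[] args) {
        List<String> messages = List.of("K1", "K2", "K3", "K4", "K5");
        // Five requests in flight, K3 fails once: prints [K1, K2, K4, K5, K3]
        System.out.println(simulate(messages, Set.of("K3"), 5));
        // One request in flight, K3 fails once: prints [K1, K2, K3, K4, K5]
        System.out.println(simulate(messages, Set.of("K3"), 1));
    }
}
```

With the limit set to 1, the retry of K3 completes before K4 is sent at all, which is why the documentation ties the reordering risk to values greater than 1.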
If you enable idempotence in the Kafka producer, the ordering is still guaranteed, as explained in "Ordering guarantees when using idempotent Kafka Producer".
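For completeness, here is a sketch of what an idempotent producer configuration might look like. It again uses plain java.util.Properties with string keys. The class name IdempotentOrderingConfig and the broker address are placeholders. As far as I know, the idempotent producer requires acks set to "all", retries greater than 0, and max.in.flight.requests.per.connection of at most 5, so the values below stay within those limits.

```java
import java.util.Properties;

// Hypothetical helper (not from the Kafka API): settings for an idempotent producer.
final class IdempotentOrderingConfig {

    static Properties build() {
        Properties props = new Properties();
        // Placeholder broker address, assumed for illustration only.
        props.setProperty("bootstrap.servers", "localhost:9092");
        // Enables the idempotent producer.
        props.setProperty("enable.idempotence", "true");
        // Idempotence requires acks=all and retries > 0.
        props.setProperty("acks", "all");
        props.setProperty("retries", "100");
        // Must not exceed 5 when idempotence is enabled.
        props.setProperty("max.in.flight.requests.per.connection", "5");
        return props;
    }

    public static void main(String[] args) {
        build().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```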