Does Kafka distinguish between consumed offset and committed offset?

Question

From what I understand, a consumer reads messages off a particular topic, and the consumer client periodically commits the offset.

So if for some reason the consumer fails on a particular message, that offset won't be committed, and you can then go back and reprocess the message.

Is there anything that tracks the offset you just consumed and the offset you then commit?

Recommended answer

Does Kafka distinguish between consumed offset and committed offset?

Yes, there is a big difference. The consumed offset is managed by the consumer itself and determines which messages it fetches next from a topic partition.

The consumer can (but does not have to) commit an offset, either automatically or by calling the commit API. The committed offset is stored in a Kafka internal topic called __consumer_offsets, keyed by ConsumerGroup, Topic and Partition. It is used when the client restarts or when a consumer joins or leaves the ConsumerGroup.
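As a rough illustration of that keying, the sketch below models the committed-offset store as an in-memory map. It is not the Kafka client API: the class `CommittedOffsetStore`, its methods, the `-1` sentinel, and the group and topic names are all hypothetical and chosen only for the example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the __consumer_offsets topic.
// This is a sketch for illustration, not the Kafka client API.
class CommittedOffsetStore {
    // Key: (consumer group, topic, partition). Value: committed offset.
    private final Map<List<Object>, Long> offsets = new HashMap<>();

    private static List<Object> key(String group, String topic, int partition) {
        return Arrays.<Object>asList(group, topic, partition);
    }

    // Stores (or overwrites) the committed offset for one group/topic/partition.
    void commit(String group, String topic, int partition, long offset) {
        offsets.put(key(group, topic, partition), offset);
    }

    // Returns the committed offset, or -1 if nothing was committed for this key.
    // (-1 is a sentinel of this sketch only.)
    long committed(String group, String topic, int partition) {
        Long value = offsets.get(key(group, topic, partition));
        return value == null ? -1L : value;
    }

    public static void main(String[] args) {
        CommittedOffsetStore store = new CommittedOffsetStore();
        store.commit("billing", "orders", 0, 42L);

        // A restarted consumer in group "billing" would resume from this value.
        System.out.println(store.committed("billing", "orders", 0));

        // Another group reading the same partition has its own committed offset.
        System.out.println(store.committed("audit", "orders", 0));
    }
}
```

Because the group is part of the key, two consumer groups can read the same partition and keep independent committed offsets.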

Just keep in mind that if your client does not commit offset n but later commits offset n+1, for Kafka this is no different from the case where you commit both offsets.
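One way to picture this: the committed offset for a partition behaves like a single value that each commit overwrites, so no record of which individual offsets were committed is kept. The sketch below models that with a hypothetical `CommittedPosition` class; it is an in-memory illustration under that assumption, not the Kafka client API.

```java
// Hypothetical in-memory model of one partition's committed offset.
// This is a sketch for illustration, not the Kafka client API.
class CommittedPosition {
    private long committed = -1L; // -1 means "nothing committed yet" in this sketch

    // A commit overwrites the stored value; no per-message history is kept.
    void commit(long offset) {
        committed = offset;
    }

    long committed() {
        return committed;
    }

    public static void main(String[] args) {
        long n = 7L;

        // Case 1: commit offset n, then offset n+1.
        CommittedPosition both = new CommittedPosition();
        both.commit(n);
        both.commit(n + 1);

        // Case 2: offset n is never committed, only n+1.
        CommittedPosition onlyLater = new CommittedPosition();
        onlyLater.commit(n + 1);

        // Both cases end in the same stored state.
        System.out.println(both.committed() == onlyLater.committed()); // prints true
    }
}
```

The practical consequence is that skipping the commit for a failed message does not protect it once a later offset is committed.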

More details on consumed and committed offsets can be found in the JavaDocs of KafkaConsumer, under Offsets and Consumer Position:

Kafka maintains a numerical offset for each record in a partition. This offset acts as a unique identifier of a record within that partition, and also denotes the position of the consumer in the partition. For example, a consumer which is at position 5 has consumed records with offsets 0 through 4 and will next receive the record with offset 5. There are actually two notions of position relevant to the user of the consumer:

The position of the consumer gives the offset of the next record that will be given out. It will be one larger than the highest offset the consumer has seen in that partition. It automatically advances every time the consumer receives messages in a call to poll(Duration).

The committed position is the last offset that has been stored securely. Should the process fail and restart, this is the offset that the consumer will recover to. The consumer can either automatically commit offsets periodically; or it can choose to control this committed position manually by calling one of the commit APIs (e.g. commitSync and commitAsync).

This distinction gives the consumer control over when a record is considered consumed. It is discussed in further detail below.
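The two notions of position can be sketched with a small single-partition simulation. The class `SimulatedConsumer` and the record names are hypothetical; its `poll`, `commit`, `position` and `committed` methods only mimic the behavior described above in memory and are not the real KafkaConsumer signatures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory model of a consumer on a single partition.
// This is a sketch for illustration, not the Kafka client API.
class SimulatedConsumer {
    private final List<String> partition; // records; list index == offset
    private long position;                // offset of the next record to hand out
    private long committed;               // offset to resume from after a restart

    SimulatedConsumer(List<String> partition, long committedOffset) {
        this.partition = partition;
        this.committed = committedOffset;
        this.position = committedOffset; // a (re)started consumer resumes here
    }

    // Returns up to maxRecords records and advances the position past them.
    List<String> poll(int maxRecords) {
        List<String> batch = new ArrayList<>();
        while (batch.size() < maxRecords && position < partition.size()) {
            batch.add(partition.get((int) position));
            position++;
        }
        return batch;
    }

    // Stores the current position as the committed offset.
    void commit() {
        committed = position;
    }

    long position() { return position; }

    long committed() { return committed; }

    public static void main(String[] args) {
        List<String> log = Arrays.asList("r0", "r1", "r2", "r3", "r4", "r5");

        SimulatedConsumer consumer = new SimulatedConsumer(log, 0L);
        consumer.poll(3);  // hands out offsets 0..2, position becomes 3
        consumer.commit(); // committed becomes 3
        consumer.poll(2);  // hands out offsets 3..4, position becomes 5
        System.out.println(consumer.position() + " " + consumer.committed());

        // Simulated crash before the second commit: the replacement consumer
        // starts from the committed offset and receives offsets 3 and 4 again.
        SimulatedConsumer restarted = new SimulatedConsumer(log, consumer.committed());
        System.out.println(restarted.poll(2));
    }
}
```

In this model, records handed out after the last commit are delivered again after a restart, which is the reprocessing behavior the question asks about.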
