Does Kafka distinguish between consumed offset and committed offset?

Question

From what I understand, a consumer reads messages off a particular topic, and the consumer client periodically commits the offset.

So if for some reason the consumer fails on a particular message, that offset won't be committed and you can then go back and reprocess the message.

Is there anything that tracks both the offset you just consumed and the offset you then commit?

Recommended answer

Does Kafka distinguish between consumed offset and committed offset?

Yes, there is a big difference. The consumed offset is managed by the consumer itself, so that it can fetch subsequent messages from a topic partition.

The consumer can (but does not have to) commit an offset, either automatically or by calling the commit API. The committed offset is stored in an internal Kafka topic called __consumer_offsets, keyed by ConsumerGroup, Topic and Partition. It is used when the client restarts or when a consumer joins or leaves the ConsumerGroup.
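To make the keying by ConsumerGroup, Topic and Partition concrete, here is a minimal sketch in plain Java. It is a toy in-memory model, not Kafka's implementation of __consumer_offsets; the class name `CommittedOffsetStore` and the group and topic names are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.OptionalLong;

// Toy model only: NOT how Kafka implements __consumer_offsets internally.
class CommittedOffsetStore {
    // Key is (group, topic, partition); a List gives value-based equals/hashCode.
    private final Map<List<Object>, Long> offsets = new HashMap<>();

    private static List<Object> key(String group, String topic, int partition) {
        return Arrays.<Object>asList(group, topic, partition);
    }

    // Record the committed offset for one group on one topic partition.
    void commit(String group, String topic, int partition, long offset) {
        offsets.put(key(group, topic, partition), offset);
    }

    // Look up where a restarted or newly assigned consumer of this group would resume.
    OptionalLong committed(String group, String topic, int partition) {
        Long offset = offsets.get(key(group, topic, partition));
        return offset == null ? OptionalLong.empty() : OptionalLong.of(offset);
    }

    public static void main(String[] args) {
        CommittedOffsetStore store = new CommittedOffsetStore();
        // Hypothetical group and topic names, for illustration only.
        store.commit("billing", "orders", 0, 42L);
        store.commit("audit", "orders", 0, 7L);

        // Each group has its own committed offset for the same partition.
        System.out.println(store.committed("billing", "orders", 0));
        System.out.println(store.committed("audit", "orders", 0));
        // Nothing has been committed for partition 1 yet.
        System.out.println(store.committed("billing", "orders", 1));
    }
}
```

The point of the sketch is that two groups reading the same partition do not share a committed offset, and a partition with no commit has no entry at all.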

Just keep in mind that if your client does not commit offset n but later commits offset n+1, for Kafka this makes no difference compared to the case where you commit both offsets: only the most recently committed offset for the partition is used.
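A small sketch of why skipping a commit changes nothing, under the assumption that each commit simply overwrites the previous committed value for the partition. The class `LastCommitWins` and its `replay` helper are hypothetical names, and the code is a plain-Java model rather than the Kafka client.

```java
// Toy model only: the committed offset is treated as a single overwritable value.
class LastCommitWins {
    // Apply a sequence of commits and return the value that survives.
    static long replay(long[] commits) {
        long committed = -1L; // -1 means "nothing committed yet" in this sketch
        for (long offset : commits) {
            committed = offset; // each commit replaces the previous one
        }
        return committed;
    }

    public static void main(String[] args) {
        long n = 5L;
        long bothCommitted = replay(new long[] {n, n + 1}); // commit n, then n+1
        long skippedN = replay(new long[] {n + 1});         // commit only n+1

        // Both runs end with the same committed offset.
        System.out.println(bothCommitted == skippedN);
    }
}
```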

More details on consumed and committed offsets can be found in the JavaDocs of KafkaConsumer, in the section "Offsets and Consumer Position":

Kafka maintains a numerical offset for each record in a partition. This offset acts as a unique identifier of a record within that partition, and also denotes the position of the consumer in the partition. For example, a consumer which is at position 5 has consumed records with offsets 0 through 4 and will next receive the record with offset 5. There are actually two notions of position relevant to the user of the consumer:

The position of the consumer gives the offset of the next record that will be given out. It will be one larger than the highest offset the consumer has seen in that partition. It automatically advances every time the consumer receives messages in a call to poll(Duration).

The committed position is the last offset that has been stored securely. Should the process fail and restart, this is the offset that the consumer will recover to. The consumer can either automatically commit offsets periodically; or it can choose to control this committed position manually by calling one of the commit APIs (e.g. commitSync and commitAsync).

This distinction gives the consumer control over when a record is considered consumed. It is discussed in further detail below.
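The two notions of position can be illustrated with a small plain-Java model. This is a sketch under simplifying assumptions, not the Kafka client: a single partition is a `List` whose index is the offset, the consumer starts at offset 0 when nothing has been committed, and the class `PositionVsCommitted` with its `poll`, `commit` and `restart` methods is hypothetical. In the real client, the corresponding values are exposed through `KafkaConsumer.position(TopicPartition)` and the commit APIs named above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: one partition, one consumer, no network, no consumer group.
class PositionVsCommitted {
    private final List<String> log; // list index == record offset
    private long position;          // offset of the next record poll() hands out
    private long committed;         // last position stored via commit()

    PositionVsCommitted(List<String> log) {
        this.log = log;
    }

    // Hand out up to maxRecords records and advance the position automatically.
    List<String> poll(int maxRecords) {
        int from = (int) position;
        int to = Math.min(log.size(), from + maxRecords);
        List<String> batch = new ArrayList<>(log.subList(from, to));
        position = to;
        return batch;
    }

    // Store the current position as the committed position.
    void commit() {
        committed = position;
    }

    // Simulate a crash and restart: the consumer recovers to the committed position.
    void restart() {
        position = committed;
    }

    long position() { return position; }
    long committed() { return committed; }

    public static void main(String[] args) {
        PositionVsCommitted consumer =
            new PositionVsCommitted(Arrays.asList("a", "b", "c", "d", "e", "f"));

        consumer.poll(3);  // consumes offsets 0..2
        consumer.commit(); // committed position becomes 3
        consumer.poll(2);  // consumes offsets 3..4, but does not commit them

        System.out.println(consumer.position() + " vs " + consumer.committed());

        consumer.restart(); // position falls back to the committed position
        System.out.println(consumer.poll(10)); // offsets 3 and 4 are delivered again
    }
}
```

In this model, the records at offsets 3 and 4 are handed out twice because they were consumed but never committed. That is the situation the question describes: a record whose processing failed before the commit is read again after a restart.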
