What determines Kafka consumer offset?

Question

I am relatively new to Kafka. I have experimented with it a bit, but a few things about consumer offsets are still unclear to me. From what I have understood so far, when a consumer starts, the offset it begins reading from is determined by the configuration setting auto.offset.reset (correct me if I am wrong).

Now say, for example, that there are 10 messages (offsets 0 to 9) in the topic, and a consumer consumed 5 of them before it went down (or before I killed it). Then say I restart that consumer process. My questions are:


  1. If auto.offset.reset is set to smallest, will it always start consuming from offset 0?

  2. If auto.offset.reset is set to largest, will it start consuming from offset 5?

  3. Is the behaviour in this kind of scenario always deterministic?

Please don't hesitate to comment if anything in my question is unclear. Thanks in advance.

Recommended answer

It is a bit more complex than you described. The auto.offset.reset config kicks in ONLY if your consumer group does not have a valid offset committed somewhere (the two supported offset storages at present are Kafka and ZooKeeper). It also depends on what sort of consumer you use.
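As a rough illustration of where these settings live, the sketch below builds a configuration for the old high-level consumer using plain java.util.Properties. The class name, the group name passed in, and the ZooKeeper address are hypothetical, and the property keys are the ones used by the old (0.8.x-era) high-level consumer, so treat this as an assumption-laden sketch rather than a reference configuration.

```java
import java.util.Properties;

// Hypothetical helper: only builds the settings, it does not start a consumer.
class OldConsumerConfigSketch {

    static Properties build(String groupId, String resetPolicy) {
        Properties props = new Properties();
        // Hypothetical ZooKeeper address; replace with your own.
        props.setProperty("zookeeper.connect", "localhost:2181");
        // Committed offsets are tracked per consumer group.
        props.setProperty("group.id", groupId);
        // Where the group's offsets are stored: "kafka" or "zookeeper".
        props.setProperty("offsets.storage", "kafka");
        // Consulted only when the group has no valid committed offset:
        // "smallest" or "largest".
        props.setProperty("auto.offset.reset", resetPolicy);
        return props;
    }

    public static void main(String[] args) {
        Properties props = build("group1", "smallest");
        System.out.println(props.getProperty("auto.offset.reset"));
    }
}
```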

If you use the high-level Java consumer, consider the following scenarios:


  1. You have a consumer in consumer group group1 that consumed 5 messages and then died. The next time you start this consumer, it won't even use the auto.offset.reset config; it will continue from the place where it died, because it simply fetches the stored offset from the offset storage (Kafka or ZooKeeper, as I mentioned).

  2. You have messages in a topic (as you described) and you start a consumer in a new consumer group, group2. No offset is stored anywhere, so this time the auto.offset.reset config decides whether to start from the beginning of the topic (smallest) or from the end of the topic (largest).
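The two scenarios above can be sketched as a small decision function. This is a simplified model, not Kafka's actual code: StartOffsetModel, Reset and startOffset are hypothetical names, the committed offset is assumed to be the next offset to read, and it is assumed that the consumer really did commit before it died.

```java
import java.util.OptionalLong;

// Simplified model of how a starting offset is chosen (not Kafka's implementation).
class StartOffsetModel {

    enum Reset { SMALLEST, LARGEST }

    // A committed offset wins; the reset policy is consulted only when none is stored.
    // logStart is the first available offset, logEnd the next offset to be written.
    static long startOffset(OptionalLong committed, long logStart, long logEnd, Reset reset) {
        if (committed.isPresent()) {
            return committed.getAsLong();
        }
        return reset == Reset.SMALLEST ? logStart : logEnd;
    }

    public static void main(String[] args) {
        // Topic holds offsets 0..9: the log starts at 0 and the next write goes to 10.
        long logStart = 0;
        long logEnd = 10;

        // Scenario 1: group1 consumed offsets 0..4 and committed 5; the policy is ignored.
        System.out.println(startOffset(OptionalLong.of(5), logStart, logEnd, Reset.SMALLEST)); // 5
        System.out.println(startOffset(OptionalLong.of(5), logStart, logEnd, Reset.LARGEST));  // 5

        // Scenario 2: group2 has no committed offset; the policy decides.
        System.out.println(startOffset(OptionalLong.empty(), logStart, logEnd, Reset.SMALLEST)); // 0
        System.out.println(startOffset(OptionalLong.empty(), logStart, logEnd, Reset.LARGEST));  // 10
    }
}
```

In this model, largest for a brand-new group means that only messages produced after the consumer starts are read; it does not mean offset 5.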

One more thing that affects which offset values correspond to the smallest and largest configs is the log retention policy. Imagine you have a topic with retention configured to 1 hour. You produce 5 messages, and then an hour later you produce 5 more. The largest offset will remain the same as in the previous example, but the smallest one can no longer be 0, because Kafka will already have removed those messages; the smallest available offset will therefore be 5.
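The retention example can be modelled with a toy log that drops messages older than the retention window. This is only a sketch under simplifying assumptions: RetentionModel and its methods are hypothetical, time is counted in whole minutes, and messages are expired one by one, whereas a real broker deletes whole log segments and does so on its own schedule.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of time-based retention (not how a real broker stores data).
class RetentionModel {

    // Timestamps (in minutes) of the messages still in the log, oldest first.
    private final Deque<Long> timestamps = new ArrayDeque<>();
    private long logStart = 0; // smallest available offset
    private long logEnd = 0;   // next offset to be written

    void append(long nowMinutes) {
        timestamps.addLast(nowMinutes);
        logEnd++;
    }

    // Drop every message whose age has reached the retention window.
    void applyRetention(long nowMinutes, long retentionMinutes) {
        while (!timestamps.isEmpty() && nowMinutes - timestamps.peekFirst() >= retentionMinutes) {
            timestamps.removeFirst();
            logStart++;
        }
    }

    long smallest() { return logStart; }

    long largest() { return logEnd; }

    public static void main(String[] args) {
        RetentionModel log = new RetentionModel();
        for (int i = 0; i < 5; i++) log.append(0);   // first 5 messages at t = 0
        for (int i = 0; i < 5; i++) log.append(61);  // 5 more, just over an hour later
        log.applyRetention(61, 60);                  // 1 hour retention

        System.out.println(log.smallest()); // 5
        System.out.println(log.largest());  // 10
    }
}
```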

None of the above applies to SimpleConsumer: every time you run it, it decides where to start from using the auto.offset.reset config.
