Under what circumstances is endOffset > lastMsg.offset + 1?


Question

Kafka returns endOffset 15 for a partition, but the last message that can be consumed from it has offset 13, rather than 14, which I would expect. I wonder why.

The Kafka documentation reads:

In the default read_uncommitted isolation level, the end offset is the high watermark (that is, the offset of the last successfully replicated message plus one). For read_committed consumers, the end offset is the last stable offset (LSO), which is the minimum of the high watermark and the smallest offset of any open transaction.
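For reference, here is a minimal sketch (not from the question) of how the end offset can be queried with the plain Java consumer and compared against the offset of the last record that can actually be consumed. Only the topic name TK, partition 0, and the localhost:9092 broker are taken from the question; the class name and everything else are illustrative:

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class EndOffsetCheck {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // read_uncommitted is the default: endOffsets() then returns the high watermark;
        // with read_committed it would return the last stable offset (LSO) instead
        props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_uncommitted");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("TK", 0);
            consumer.assign(Collections.singletonList(tp));
            consumer.seekToBeginning(Collections.singletonList(tp));

            long endOffset = consumer.endOffsets(Collections.singletonList(tp)).get(tp);

            long lastConsumedOffset = -1L;
            ConsumerRecords<String, String> records;
            // drain everything that is currently consumable and remember the last offset seen
            while (!(records = consumer.poll(Duration.ofSeconds(2))).isEmpty()) {
                for (ConsumerRecord<String, String> r : records.records(tp)) {
                    lastConsumedOffset = r.offset();
                }
            }
            // with the data from the question this would print endOffset=15, lastMsg.offset=13
            System.out.printf("endOffset=%d, lastMsg.offset=%d%n", endOffset, lastConsumedOffset);
        }
    }
}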

Here's kafkacat's output. I'm using kafkacat because it can print the message offsets:

$ kafkacat -Ce -p0 -tTK -f'offset: %o key: %k\n'
offset: 0 key: 0108
offset: 1 key: 0253
offset: 4 key: 0278
offset: 5 key: 0198
offset: 8 key: 0278
offset: 9 key: 0210
offset: 10 key: 0253
offset: 11 key: 1058
offset: 12 key: 0141
offset: 13 key: 1141
% Reached end of topic TK [0] at offset 15: exiting

What's also baffling - and it may very well be related - is that the offsets are not consecutive, although I have not set up compaction etc.

More details:

$ kafka-topics.sh --bootstrap-server localhost:9092 --topic TK --describe
Topic: TK       PartitionCount: 2       ReplicationFactor: 1    Configs: segment.bytes=1073741824
        Topic: TK       Partition: 0    Leader: 0       Replicas: 0     Isr: 0
        Topic: TK       Partition: 1    Leader: 0       Replicas: 0     Isr: 0

Printing the keys via kafka-console-consumer.sh:

$ kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic TK \
  --offset earliest --partition 0 --timeout-ms 5000 \
  --property print.key=true --property print.value=false
0108
0253
0278
0198
0278
0210
0253
1058
0141
1141
[2021-09-15 10:54:06,556] ERROR Error processing message, terminating consumer process:  (kafka.tools.ConsoleConsumer$)
org.apache.kafka.common.errors.TimeoutException
Processed a total of 10 messages

N.B. This topic has been produced to without any involvement of transactions, and *) consumption is done in read_uncommitted mode.

*) Actually, processing.guarantee is set to exactly_once_beta, so that would amount to using transactions.

More info: It turns out I can reliably reproduce this case with my Streams app (1. wipe the kafka/zookeeper data, 2. recreate the topics, 3. run the app), whose output is the topic that shows this problem. I've meanwhile trimmed the Streams app down to this no-op topology and can still reproduce it:

Topologies:
   Sub-topology: 0
    Source: KSTREAM-SOURCE-0000000000 (topics: [TK1])
      --> KSTREAM-SINK-0000000001
    Sink: KSTREAM-SINK-0000000001 (topic: TK)
      <-- KSTREAM-SOURCE-0000000000
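For what it's worth, that topology corresponds to a Streams program along these lines. This is a reconstruction, not the question's actual source: the topic names TK1 and TK and the processing.guarantee value are from the question, while the application id and serde choices are assumptions made for illustration:

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

import java.util.Properties;

public class NoOpCopy {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "tk-noop");            // hypothetical id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.ByteArray().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.ByteArray().getClass());
        // the setting under suspicion; per the answer below, removing it makes the problem go away
        props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE_BETA);

        StreamsBuilder builder = new StreamsBuilder();
        // no-op topology: copy every record from TK1 to TK unchanged
        builder.stream("TK1").to("TK");

        new KafkaStreams(builder.build(), props).start();
    }
}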


News: Meanwhile I have replaced the locally running Kafka broker (2.5.0) with one running in a Docker container (wurstmeister/kafka:2.13-2.6.0). The problem persists.

The app is using Kafka libraries versioned 6.0.1-ccs, corresponding to 2.6.0.

Answer

When I remove the setting processing.guarantee: exactly_once_beta, the problem goes away. As far as this problem is concerned, it doesn't matter whether I use exactly_once_beta or exactly_once.

I still wonder why that happens with exactly_once(_beta) - after all, in my tests everything is smooth sailing and there are no transaction rollbacks etc.

In my latest tests this rule seems to apply to all partitions with at least one item in them:

endOffset == lastMsg.offset + 3

That's 2 more than expected.

The Kafka docs mentioned in the question say:

For read_committed consumers, the end offset is the last stable offset (LSO), which is the minimum of the high watermark and the smallest offset of any open transaction.

So is Kafka perhaps pre-allocating offsets for 2 (???) transactions per partition?

