Kafka partitions and offsets disappeared

Question

My Kafka clients run in a GCP App Engine Flex environment with autoscaling enabled (GCP keeps at least two instances running, and the count has mostly stayed at two because CPU usage is low). The consumer groups running on those two VMs have been consuming messages from various topics with 20 partitions for several months. Recently I noticed that the older topics had shrunk to just 1 partition (!) and that the offsets for that consumer group had been reset to 0. The topic-[partition] directories were also gone from the kafka-logs directory. Strangely, the partitions of recently created topics are intact. I have 3 separate environments (all in GCP), and this happened in all three. We haven't seen any lost messages or data problems, but we want to understand what happened so that it does not happen again.

The Kafka broker and ZooKeeper run on the same single GCP Compute Engine instance (I know this is not best practice, and I plan to improve it). I suspect the problem is related to a machine restart that wipes out some information. However, I verified that the data files are written under the /opt/bitnami/(kafka|bitnami) directory, not under /tmp, which can be cleared by machine restarts.
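The check described above (confirming where the broker writes its data) can be sketched in a few lines of Python. This is only an illustration: it assumes you can read the text of the broker's server.properties, it relies on the standard `log.dirs` / `log.dir` settings (Kafka falls back to /tmp/kafka-logs when neither is set), and the rule of flagging any path that contains a `tmp` component is a heuristic chosen for this example, not something Kafka itself enforces. The function names are hypothetical.

```python
from pathlib import PurePosixPath


def parse_log_dirs(properties_text):
    """Return the data directories configured in a server.properties body."""
    values = {}
    for raw in properties_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines without a key=value pair.
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    # log.dirs wins over log.dir; the broker default is /tmp/kafka-logs.
    configured = values.get("log.dirs") or values.get("log.dir") or "/tmp/kafka-logs"
    return [d.strip() for d in configured.split(",") if d.strip()]


def suspicious_dirs(dirs):
    """Heuristic (assumption): flag any directory with a 'tmp' path component."""
    return [d for d in dirs if "tmp" in PurePosixPath(d).parts]
```

For example, `suspicious_dirs(parse_log_dirs(text))` would flag both a plain /tmp/kafka-logs and a tmp/ folder nested inside an installation directory, which is worth checking even when the data is not under the system /tmp.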

  • Spring Kafka 1.1.3
  • Kafka client 0.10.1.1
  • Single-node Kafka broker 0.10.1.0
  • Single-node ZooKeeper 3.4.9

Any insights on this would be appreciated!

Recommended answer

Bitnami developer here. I was able to reproduce the issue and traced it to an init script that was clearing the contents of the tmp/kafka-logs/ folder.

We released a new revision of the Kafka installers, virtual machines, and cloud images that fixes the issue. The fix is included in revision 1.0.0-2.
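To notice this kind of wipe early, one option is to snapshot the partition directories before and after a restart and compare the two. The sketch below is not part of the Bitnami fix; it only relies on the fact that a broker stores each partition in a directory named `<topic>-<partition>` inside its log directory. The function names and the way the log directory is passed in are assumptions made for illustration.

```python
import os
import re
from collections import Counter

# Partition directories are named "<topic>-<partition number>".
PARTITION_DIR = re.compile(r"^(?P<topic>.+)-(?P<partition>\d+)$")


def partition_counts(log_dir):
    """Count partition directories per topic under a Kafka log directory."""
    counts = Counter()
    for entry in os.scandir(log_dir):
        if not entry.is_dir():
            continue  # skip checkpoint files and other regular files
        match = PARTITION_DIR.match(entry.name)
        if match:
            counts[match.group("topic")] += 1
    return dict(counts)


def missing_partitions(before, after):
    """Return {topic: number of partition directories lost} between two snapshots."""
    return {
        topic: n - after.get(topic, 0)
        for topic, n in before.items()
        if after.get(topic, 0) < n
    }
```

Calling `partition_counts` on the log directory before a planned restart and again afterwards, then passing both results to `missing_partitions`, would show a topic that dropped from 20 partition directories to 1 as a loss of 19.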
