Fixing under-replicated partitions in Kafka


Problem description


In our production environment, we often see partitions go under-replicated while messages are being consumed from the topics. We are using Kafka 0.11. From the documentation, what I understand is:


Configuration parameter replica.lag.max.messages was removed. Partition leaders will no longer consider the number of lagging messages when deciding which replicas are in sync.


Configuration parameter replica.lag.time.max.ms now refers not just to the time passed since last fetch request from the replica, but also to time since the replica last caught up. Replicas that are still fetching messages from leaders but did not catch up to the latest messages in replica.lag.time.max.ms will be considered out of sync.
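As a sketch, this is the relevant broker setting in server.properties (the value shown is the documented 0.11 default, not a tuning recommendation):

```properties
# Followers that have not caught up to the leader's log end within this window
# are dropped from the ISR, even if they are still fetching (0.11 default):
replica.lag.time.max.ms=10000
# replica.lag.max.messages was removed and is no longer recognized.
```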


How do we fix this issue? What are the different reasons for replicas to go out of sync? In our scenario, all the Kafka brokers sit in a single rack of blade servers and share the same network over 10 Gbps Ethernet (simplex), so I see no reason for replicas to fall out of sync due to the network.
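As a sketch of what "under-replicated" means in practice, the filter below flags partitions whose in-sync replica set (Isr) is smaller than the full replica list, working on the text format of `kafka-topics.sh --describe`. The printf input is canned sample data; against a live cluster, Kafka's own `--under-replicated-partitions` flag on `kafka-topics.sh --describe` does this for you.

```shell
# under_replicated: read `kafka-topics.sh --describe` lines on stdin and print
# only partitions where the Isr list is shorter than the Replicas list.
under_replicated() {
  awk '/Replicas:/ {
    split($0, a, "Replicas: "); split(a[2], r, " "); nrep = split(r[1], x, ",")
    split($0, b, "Isr: ");      nisr = split(b[2], y, ",")
    if (nisr < nrep) print   # fewer in-sync replicas than assigned replicas
  }'
}

# Canned sample: partition 1 has lost broker 1 from its ISR.
printf '%s\n' \
  'Topic: t1 Partition: 0 Leader: 1 Replicas: 1,2,3 Isr: 1,2,3' \
  'Topic: t1 Partition: 1 Leader: 2 Replicas: 2,3,1 Isr: 2,3' \
  | under_replicated
```

Only the `Partition: 1` line is printed, since its ISR (2,3) is smaller than its replica set (2,3,1).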

Answer


We faced the same issue:

The solution was:

  1. Restart the ZooKeeper leader.
  2. Restart the broker/brokers whose partitions were under-replicated.

There was no data loss.
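For step 1, the ZooKeeper leader can be located with the `srvr` four-letter command before restarting it. A sketch, assuming a standard ensemble on the default client port; the hostnames and the restart command are placeholders for your environment:

```shell
# zk_mode prints the "Mode:" line from a ZooKeeper node's `srvr` response.
# Run it against each ensemble member (placeholder hostnames zk1, zk2, zk3)
# and restart the node that reports "Mode: leader", e.g.:
#   ssh zk2 'zkServer.sh restart'
zk_mode() {
  echo srvr | nc "$1" 2181 | grep '^Mode:'
}

# Demonstration of the filtering on a canned `srvr` response:
printf 'Zookeeper version: 3.4.10\nLatency min/avg/max: 0/0/5\nMode: leader\n' \
  | grep '^Mode:'   # → Mode: leader
```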


The issue was due to a faulty state in ZooKeeper; there was an open issue on ZK for this, but I don't remember the number.

