Kafka partitions out of sync on certain nodes

Problem Description

I'm running a Kafka cluster on 3 EC2 instances. Each instance runs kafka (0.11.0.1) and zookeeper (3.4). My topics are configured so that each has 20 partitions and ReplicationFactor of 3.
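(For context, the question does not show how the topics were created; a topic with this layout would typically come from a command along these lines, a minimal sketch reusing the ZooKeeper connection string shown below:)

bin/kafka-topics.sh --zookeeper "10.0.0.1:2181,10.0.0.2:2181,10.0.0.3:2181" \
    --create --topic prod-decline --partitions 20 --replication-factor 3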

Today I noticed that some partitions refuse to sync to all three nodes. Here's an example:

bin/kafka-topics.sh --zookeeper "10.0.0.1:2181,10.0.0.2:2181,10.0.0.3:2181" --describe --topic prod-decline
Topic:prod-decline    PartitionCount:20    ReplicationFactor:3    Configs:
    Topic: prod-decline    Partition: 0    Leader: 2    Replicas: 1,2,0    Isr: 2
    Topic: prod-decline    Partition: 1    Leader: 2    Replicas: 2,0,1    Isr: 2
    Topic: prod-decline    Partition: 2    Leader: 0    Replicas: 0,1,2    Isr: 2,0,1
    Topic: prod-decline    Partition: 3    Leader: 1    Replicas: 1,0,2    Isr: 2,0,1
    Topic: prod-decline    Partition: 4    Leader: 2    Replicas: 2,1,0    Isr: 2
    Topic: prod-decline    Partition: 5    Leader: 2    Replicas: 0,2,1    Isr: 2
    Topic: prod-decline    Partition: 6    Leader: 2    Replicas: 1,2,0    Isr: 2
    Topic: prod-decline    Partition: 7    Leader: 2    Replicas: 2,0,1    Isr: 2
    Topic: prod-decline    Partition: 8    Leader: 0    Replicas: 0,1,2    Isr: 2,0,1
    Topic: prod-decline    Partition: 9    Leader: 1    Replicas: 1,0,2    Isr: 2,0,1
    Topic: prod-decline    Partition: 10    Leader: 2    Replicas: 2,1,0    Isr: 2
    Topic: prod-decline    Partition: 11    Leader: 2    Replicas: 0,2,1    Isr: 2
    Topic: prod-decline    Partition: 12    Leader: 2    Replicas: 1,2,0    Isr: 2
    Topic: prod-decline    Partition: 13    Leader: 2    Replicas: 2,0,1    Isr: 2
    Topic: prod-decline    Partition: 14    Leader: 0    Replicas: 0,1,2    Isr: 2,0,1
    Topic: prod-decline    Partition: 15    Leader: 1    Replicas: 1,0,2    Isr: 2,0,1
    Topic: prod-decline    Partition: 16    Leader: 2    Replicas: 2,1,0    Isr: 2
    Topic: prod-decline    Partition: 17    Leader: 2    Replicas: 0,2,1    Isr: 2
    Topic: prod-decline    Partition: 18    Leader: 2    Replicas: 1,2,0    Isr: 2
    Topic: prod-decline    Partition: 19    Leader: 2    Replicas: 2,0,1    Isr: 2
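(Side note: the same tool can also list only the problematic partitions directly; a minimal sketch, assuming the same ZooKeeper connection string:)

bin/kafka-topics.sh --zookeeper "10.0.0.1:2181,10.0.0.2:2181,10.0.0.3:2181" \
    --describe --under-replicated-partitions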

Only node 2 has all the data in-sync. I've tried restarting brokers 0 and 1 but it didn't improve the situation - it made it even worse. I'm tempted to restart node 2 but I'm assuming it will lead to downtime or cluster failure so I'd like to avoid it if possible.

I'm not seeing any obvious errors in logs so I'm having a hard time figuring out how to debug the situation. Any tips would be greatly appreciated.

Thanks!

Some additional info: if I check the metrics on node 2 (the one with full data), it does recognize that some partitions are not correctly replicated:

$>get -d kafka.server -b kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions *
#mbean = kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions:
Value = 930;

Nodes 0 and 1 don't. They seem to think everything is fine:

$>get -d kafka.server -b kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions *
#mbean = kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions:
Value = 0;

Is this expected behavior?

Recommended Answer

Try increasing replica.lag.time.max.ms.

The explanation goes like this:

If a replica fails to send a fetch request for longer than replica.lag.time.max.ms, it is considered dead and is removed from the ISR.

If a replica starts lagging behind the leader for longer than replica.lag.time.max.ms, then it is considered too slow and is removed from the ISR. So even if there is a spike in traffic and large batches of messages are written on the leader, unless the replica consistently remains behind the leader for replica.lag.time.max.ms, it will not shuffle in and out of the ISR.
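For reference, replica.lag.time.max.ms is a broker-level setting (default 10000 ms in Kafka 0.11), so the change goes into each broker's server.properties and takes effect after a broker restart. A minimal sketch; the 30000 ms value below is only an illustrative bump, not a recommendation:

# config/server.properties on every broker
# Default in Kafka 0.11 is 10000 ms; raising it gives slow followers more time
# to catch up before they are dropped from the ISR. 30000 is just an example value.
replica.lag.time.max.ms=30000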
