Kafka simple consumer intermittently missing messages


Question

I have a Kafka application whose messages I have been consuming with kafka-console-consumer.sh as follows:

$ ./kafka-console-consumer.sh --zookeeper zookeeperhost:2181 --topic myTopic

This returns every message that I write to the Kafka broker through a Kafka producer, without missing any.

Recently I deployed the application in a different environment where zookeeperhost is not accessible (for some reason). So I am using kafka-simple-consumer-shell.sh instead, as below:

$ ./kafka-simple-consumer-shell.sh --broker-list brokerhost:9092 --topic myTopic --partition 0 --max-messages 1

But with this I see a few messages (around 2-4 in 5000) go missing. Could someone please explain how kafka-simple-consumer-shell.sh reads messages?

I suspect that some messages are going to a different partition, and since I am only reading from partition 0, I do not get all the messages every time. But I do not know how to check how many partitions there are, or what the IDs of the other partitions are. I tried partition 1, but it does not work.

Can anyone help?

Recommended answer

kafka-simple-consumer-shell.sh simply creates a consumer that reads messages from one partition. So your command reads a single message (because of --max-messages 1) from partition 0 of myTopic on brokerhost:9092. If partition 1 is not hosted on the same broker, requesting it the way you did will not work. (For more information, check the code on GitHub.)

If you can access the ZooKeeper host, you can check how partitions are distributed across the cluster with:

bin/kafka-topics.sh --describe --zookeeper zookeeperhost:2181 --topic myTopic

If you can't access the ZooKeeper host, I can think of two approaches:

  1. Pass a list of all brokers as the parameter and try partition numbers from 0 to N. You can give --broker-list multiple brokers in the format broker1:port1,broker2:port2,broker3:port3. This tells you how many partitions exist in the entire cluster, but you still don't know which broker has which partitions.
  2. Manually check the log directory of each broker. Look in /tmp/kafka-logs (if you are using the default log directory). You will find directories such as myTopic-0, myTopic-1, ..., named in the format topic-partition#. This lets you check by hand which broker has which partitions.
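
The two approaches above can be sketched as a pair of small shell helpers. This is only an illustration, not part of Kafka: the function names list_partitions and probe_commands are hypothetical, and the first one assumes the topic-partition# directory naming described above. probe_commands only prints the kafka-simple-consumer-shell.sh commands (using the same flags as in the question) so that you can review them before running anything.

```shell
#!/bin/sh

# Hypothetical helper for approach 2: print the partition ids found in one
# broker's log directory for a topic, assuming directories are named
# <topic>-<partition#> as described above.
list_partitions() {
    log_dir=$1   # e.g. /tmp/kafka-logs if the default log directory is used
    topic=$2     # e.g. myTopic
    for dir in "$log_dir/$topic"-*; do
        [ -d "$dir" ] || continue
        # Strip the "<log_dir>/<topic>-" prefix; what remains should be the id.
        id=${dir#"$log_dir/$topic-"}
        case $id in
            ''|*[!0-9]*) continue ;;  # skip anything that is not purely numeric
        esac
        echo "$id"
    done | sort -n
}

# Hypothetical helper for approach 1: print one probe command per partition
# number from 0 to max_partition. Nothing is executed here.
probe_commands() {
    brokers=$1        # e.g. broker1:port1,broker2:port2
    topic=$2
    max_partition=$3  # the N you want to try up to
    p=0
    while [ "$p" -le "$max_partition" ]; do
        echo "./kafka-simple-consumer-shell.sh --broker-list $brokers --topic $topic --partition $p --max-messages 1"
        p=$((p + 1))
    done
}
```

For example, after sourcing the file, list_partitions /tmp/kafka-logs myTopic run on each broker shows which partitions that broker holds, and probe_commands brokerhost:9092 myTopic 3 prints the four commands for partitions 0 through 3.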
