Continuous consumer group rebalancing with more consumers than partitions


Problem Description

We have the following setup:

  • Kafka v0.11.0.0
  • 3 brokers
  • 2 topics, each with 2 partitions and a replication factor of 3
  • 2 consumer groups, one per topic
  • 3 servers hosting the consumers

Each server contains two consumers, one for each topic, such that:

  • Server A
    • consumer-A1 in group topic-1-group consuming topic-1
    • consumer-A2 in group topic-2-group consuming topic-2
  • Server B
    • consumer-B1 in group topic-1-group consuming topic-1
    • consumer-B2 in group topic-2-group consuming topic-2
  • Server C
    • consumer-C1 in group topic-1-group consuming topic-1
    • consumer-C2 in group topic-2-group consuming topic-2
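
Each consumer follows the standard subscribe/poll pattern (we do not assign partitions manually; see the questions further down). The following is only a rough sketch of what, say, consumer-A1 looks like; the class name, broker addresses, and deserializers are placeholders rather than our actual code:

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class Topic1Consumer {                       // placeholder class name
    public static void main(String[] args) {
        Properties props = new Properties();
        // Placeholder broker list; the real setup has 3 brokers.
        props.put("bootstrap.servers", "broker1:9092,broker2:9092,broker3:9092");
        // One consumer group per topic, as described above.
        props.put("group.id", "topic-1-group");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        // Group-managed assignment via subscribe(), not manual assign().
        consumer.subscribe(Collections.singletonList("topic-1"));

        while (true) {
            // The 0.11 client uses poll(long timeoutMs).
            ConsumerRecords<String, String> records = consumer.poll(100);
            for (ConsumerRecord<String, String> record : records) {
                System.out.printf("partition=%d offset=%d value=%s%n",
                        record.partition(), record.offset(), record.value());
            }
        }
    }
}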

In this scenario, when we examine the output of kafka-consumer-groups.bat for group topic-1-group, we see the following:

  • consumer-B1 is assigned to topic-1 partition 1
  • consumer-C1 is assigned to topic-1 partition 0
  • consumer-A1 has no partition assigned

This appears to be what we would expect. Since the partition count is 2, we only have two active consumers; the third consumer is simply idle. We are able to consume messages from the topic just fine.

Next, we shut down Server B (which is actively assigned a partition). Doing so, we would expect topic-1-group to enter a rebalance and expect consumer-A1 to take the place of consumer-B1 and be assigned a partition, such that the following is true:

  • consumer-A1 is assigned to topic-1 partition 1
  • consumer-C1 is assigned to topic-1 partition 0
  • consumer-B1 is no longer assigned, since it is no longer active

What we see happen, though, is that the consumer group topic-1-group enters a rebalancing state that never seems to stop. Heartbeats also seem to fail while the group is rebalancing.

The only way to recover from this is to shut down another server so that there is only one consumer left in topic-1-group. With only one consumer, we are able to receive messages for the topic successfully. If we then start the other two servers back up, we continue to receive messages for the topic without problems.
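
For what it's worth, the rebalance activity is visible from the client side if a ConsumerRebalanceListener is passed along with the subscription. The helper below is only an illustrative sketch (not part of our actual code) showing how the repeated revoke/assign cycle can be logged:

import java.util.Collection;
import java.util.Collections;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class RebalanceLogging {                     // hypothetical helper class
    static void subscribeWithLogging(KafkaConsumer<String, String> consumer, String topic) {
        consumer.subscribe(Collections.singletonList(topic), new ConsumerRebalanceListener() {
            @Override
            public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                // Invoked when a rebalance starts and this consumer gives up its partitions.
                System.out.println("Rebalance started, revoked: " + partitions);
            }

            @Override
            public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                // Invoked when the rebalance completes; the list is empty for the idle consumer.
                System.out.println("Rebalance finished, assigned: " + partitions);
            }
        });
    }
}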

Questions

  • Is this a valid use case?
  • What is happening in this scenario?
  • Could there be a problem with the consumers? (In terms of configuration we use the defaults, apart from basics such as the topic, consumer group, and so on. We use KafkaConsumer.subscribe(Collection) rather than manually assigning partitions; the rebalance-related settings we leave at their defaults are sketched after this list.)
  • Could there be a problem with the brokers/ZooKeeper?
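
For reference, these are the consumer settings that govern heartbeating and group membership in the 0.11 client, none of which we override; the values shown are, to the best of our knowledge, the documented defaults, and the class is only an illustrative container:

import java.util.Properties;

public class RebalanceRelevantConfig {              // illustrative container only
    static Properties rebalanceDefaults() {
        Properties props = new Properties();
        props.put("session.timeout.ms", "10000");    // coordinator drops a consumer after this long without heartbeats
        props.put("heartbeat.interval.ms", "3000");  // how often the background thread sends heartbeats
        props.put("max.poll.interval.ms", "300000"); // maximum allowed gap between poll() calls
        props.put("max.poll.records", "500");        // upper bound on records returned by a single poll()
        return props;
    }
}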

Recommended Answer

(I'll post this as an answer since I'm not cool enough to comment. And this may be 'the answer', albeit an unsatisfying one: more consumers than partitions is not a supported configuration.)

According to the Kafka documentation (https://kafka.apache.org/documentation.html#introduction):

"By having a notion of parallelism—the partition—within the topics, Kafka is able to provide both ordering guarantees and load balancing over a pool of consumer processes. This is achieved by assigning the partitions in the topic to the consumers in the consumer group so that each partition is consumed by exactly one consumer in the group. By doing this we ensure that the consumer is the only reader of that partition and consumes the data in order. Since there are many partitions this still balances the load over many consumer instances. Note however that there cannot be more consumer instances in a consumer group than partitions."

In practice the extra consumer stays idle until an active consumer goes away, but it seems it can sometimes get into a state where it is perpetually rebalancing.

This Stack Overflow thread (In Apache Kafka why can't there be more consumer instances than partitions?) discusses the issue and talks about why you'd want fewer consumers than partitions, but it doesn't say what happens when you have more. One of the interesting comments gives a reason why you might want to configure more consumers (for failover), but it received no replies: "Now we additionally want to make sure that even if some of the consumer instances fail, we still have one partition per consumer instance. The logical way of doing this would be to add more consumers to the group; while everything is OK they wouldn't do anything, but when some consumer fails one of them would receive that partition. Why is this not allowed?"

