How does Kafka broadcast to many consumer groups?

Question

I am new to Kafka and would very much appreciate clarification on the following case.

The Kafka documentation says, in the "Consumer Position" section:

"Our topic is divided into a set of totally ordered partitions, each of which is consumed by one consumer at any given time."

Based on the statement above, if a few consumer groups subscribe to a topic and a producer publishes a message to a particular partition of that topic, then only one consumer can pull the message.

The question is: how can a message be broadcast to many consumer groups if only one consumer can pull a particular message?

Recommended answer

If a topic has 10 partitions and 3 consumer instances (C1, C2, C3, started in that order) all belong to the same consumer group, there are different consumption models that allow parallel reads, as described below.

Each consumer uses a single stream. In this model, when C1 starts, all 10 partitions of the topic are mapped to the same stream, and C1 starts consuming from that stream. When C2 starts, Kafka rebalances the partitions between the two streams, so each stream is assigned 5 partitions (depending on the rebalance algorithm, it might also be 4 vs. 6), and each consumer consumes from its own stream. Similarly, when C3 starts, the partitions are rebalanced again among the 3 streams. Note that in this model, when a stream is assigned more than one partition, message order is preserved only within each partition; messages from different partitions arrive interleaved in no guaranteed order.
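The rebalancing steps in this model can be sketched as a small simulation. This is only an illustration in plain Python, not Kafka's actual assignor code: the function name `assign_partitions` and the contiguous-range split are assumptions chosen to mirror the 10-partition example, and a real cluster may produce a different split depending on the configured assignment strategy.

```python
def assign_partitions(num_partitions, streams):
    """Split partition ids 0..num_partitions-1 into contiguous ranges, one per stream.

    Hypothetical helper for illustration only; it is not part of any Kafka client.
    """
    base, extra = divmod(num_partitions, len(streams))
    assignment = {}
    start = 0
    for i, stream in enumerate(sorted(streams)):
        # The first `extra` streams take one additional partition each.
        count = base + (1 if i < extra else 0)
        assignment[stream] = list(range(start, start + count))
        start += count
    return assignment


# Consumers join one at a time; each join triggers a fresh assignment.
print(assign_partitions(10, ["C1"]))              # C1 owns all 10 partitions
print(assign_partitions(10, ["C1", "C2"]))        # 5 partitions each
print(assign_partitions(10, ["C1", "C2", "C3"]))  # 4, 3 and 3 partitions
```

Each partition always ends up with exactly one owner inside the group, which is the property the quoted documentation sentence describes.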

Each consumer uses more than one stream (say C1 uses 3, C2 uses 3, and C3 uses 4). In this model, when C1 starts, all 10 partitions are assigned to its 3 streams, and C1 can consume from the 3 streams concurrently using multiple threads. When C2 starts, the partitions are rebalanced among the 6 streams, and similarly, when C3 starts, the partitions are rebalanced among the 10 streams. Each consumer can consume concurrently from multiple streams. Note that in this example the number of streams equals the number of partitions. If the number of streams exceeds the number of partitions, some streams will not receive any messages, because no partitions are assigned to them.
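A rough sketch of the multi-stream case follows, including what happens when there are more streams than partitions. The helper `assign_to_streams`, the stream names such as `C1-0`, and the round-robin distribution are all assumptions made for illustration; Kafka's real assignment logic is more involved and may distribute partitions differently.

```python
def assign_to_streams(num_partitions, streams_per_consumer):
    """Distribute partitions round-robin over every stream of every consumer.

    Hypothetical helper: `streams_per_consumer` maps a consumer name to the
    number of streams (threads) it opens.
    """
    streams = [
        f"{consumer}-{i}"
        for consumer, count in streams_per_consumer.items()
        for i in range(count)
    ]
    assignment = {stream: [] for stream in streams}
    for partition in range(num_partitions):
        assignment[streams[partition % len(streams)]].append(partition)
    return assignment


# 10 streams for 10 partitions: every stream owns exactly one partition.
balanced = assign_to_streams(10, {"C1": 3, "C2": 3, "C3": 4})

# 12 streams for 10 partitions: two streams are left without any partition.
oversubscribed = assign_to_streams(10, {"C1": 3, "C2": 3, "C3": 6})
idle = [stream for stream, parts in oversubscribed.items() if not parts]
print(idle)  # ['C3-4', 'C3-5']
```

The idle streams show why adding more streams than partitions does not increase read parallelism.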

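If there is another consumer group, the same process is applied to the consumers within that consumer group. The one-consumer-per-partition rule holds only within a group: each consumer group tracks its own offsets and independently receives every message in the topic, which is how a message is broadcast to many consumer groups.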
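To illustrate why every consumer group sees every message, here is a minimal in-memory model of a single partition with one read position per group. The class `PartitionLog` and the group names `billing` and `audit` are hypothetical; this models only the idea of per-group offsets and is not the Kafka client API.

```python
class PartitionLog:
    """Toy model of one partition: an append-only log plus one offset per group."""

    def __init__(self):
        self.messages = []
        self.offsets = {}  # group id -> index of the next message to read

    def publish(self, message):
        # Messages are appended once and are not removed when they are read.
        self.messages.append(message)

    def poll(self, group_id):
        # Each group reads from its own position, independent of other groups.
        start = self.offsets.get(group_id, 0)
        batch = self.messages[start:]
        self.offsets[group_id] = len(self.messages)
        return batch


log = PartitionLog()
log.publish("m1")
log.publish("m2")

print(log.poll("billing"))  # ['m1', 'm2']
print(log.poll("audit"))    # ['m1', 'm2'] (the second group gets the same messages)
print(log.poll("billing"))  # [] (this group has already read everything)
```

Because reading does not delete messages and each group keeps a separate offset, a message published once is delivered to one consumer in every subscribed group.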
