How Kafka broadcasts to many Consumer Groups


Question

I am new to Kafka and would very much appreciate clarification on the following case.

The Kafka documentation says, in the paragraph "Consumer Position":

"Our topic is divided into a set of totally ordered partitions, each of which is consumed by one consumer at any given time."

Based on the statement above, if several Consumer Groups subscribe to a topic and a Producer publishes a message to a particular partition within this topic, then only one Consumer can pull the message.

The question is: how can a broadcast to many Consumer Groups happen if only one Consumer can pull a particular message?

Answer

If a topic has 10 partitions and 3 consumer instances (C1, C2, C3, started in that order) all belong to the same Consumer Group, there are different consumption models that allow read parallelism, as described below.

Each consumer uses a single stream. In this model, when C1 starts, all 10 partitions of the topic are mapped to the same stream and C1 starts consuming from that stream. When C2 starts, Kafka rebalances the partitions between the two streams, so each stream is assigned 5 partitions (depending on the rebalance algorithm it might also be 4 vs 6) and each consumer consumes from its own stream. Similarly, when C3 starts, the partitions are rebalanced again between the 3 streams. Note that in this model, when consuming from a stream assigned to more than one partition, messages are ordered within each partition but interleaved across partitions.
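The rebalancing in the single-stream model can be sketched with a toy simulation. This is a minimal sketch, not Kafka's actual assignment algorithm: the function name `assign_round_robin` and the round-robin dealing are assumptions made for illustration, and a real assignor may split the partitions differently.

```python
from typing import Dict, List


def assign_round_robin(partitions: List[int], streams: List[str]) -> Dict[str, List[int]]:
    """Toy rebalance (hypothetical helper): deal partitions to streams one at a time."""
    assignment: Dict[str, List[int]] = {s: [] for s in streams}
    for i, p in enumerate(partitions):
        assignment[streams[i % len(streams)]].append(p)
    return assignment


partitions = list(range(10))  # a topic with 10 partitions

# One stream per consumer; consumers join in the order C1, C2, C3.
# Every join triggers a fresh assignment over all streams in the group.
after_c1 = assign_round_robin(partitions, ["C1"])
after_c2 = assign_round_robin(partitions, ["C1", "C2"])
after_c3 = assign_round_robin(partitions, ["C1", "C2", "C3"])

for label, a in [("C1", after_c1), ("C1+C2", after_c2), ("C1+C2+C3", after_c3)]:
    print(label, {s: len(ps) for s, ps in a.items()})
```

Under this toy scheme the partition counts go from 10, to 5 and 5, to 4, 3 and 3, and every partition always has exactly one owner within the group.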

Each consumer uses more than one stream (say C1 uses 3, C2 uses 3 and C3 uses 4). In this model, when C1 starts, all 10 partitions are assigned to its 3 streams and C1 can consume from the 3 streams concurrently using multiple threads. When C2 starts, the partitions are rebalanced between the 6 streams, and similarly, when C3 starts, the partitions are rebalanced between the 10 streams. Each consumer can consume concurrently from multiple streams. Note that the number of streams and the number of partitions are equal here. If the number of streams exceeds the number of partitions, some streams will not get any messages, because they will not be assigned any partitions.
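The multi-stream case, including what happens when streams outnumber partitions, can be illustrated with a small simulation. This is a sketch under assumptions: `rebalance` and `streams_for` are hypothetical helpers, the stream names such as `C1-0` are invented, and the round-robin dealing stands in for whatever algorithm Kafka actually applies.

```python
from typing import Dict, List


def rebalance(num_partitions: int, streams: List[str]) -> Dict[str, List[int]]:
    """Toy rebalance (hypothetical): deal partitions to streams round-robin."""
    assignment: Dict[str, List[int]] = {s: [] for s in streams}
    for p in range(num_partitions):
        assignment[streams[p % len(streams)]].append(p)
    return assignment


def streams_for(consumers: Dict[str, int]) -> List[str]:
    """Name each stream '<consumer>-<index>', e.g. 'C1-0' (invented naming)."""
    return [f"{c}-{i}" for c, n in consumers.items() for i in range(n)]


# 10 streams for 10 partitions: every stream owns exactly one partition.
ten = rebalance(10, streams_for({"C1": 3, "C2": 3, "C3": 4}))

# 12 streams for 10 partitions: two streams end up with nothing to read.
twelve = rebalance(10, streams_for({"C1": 4, "C2": 4, "C3": 4}))
idle = [s for s, ps in twelve.items() if not ps]

print(ten)
print("idle streams:", idle)
```

The idle streams are the reason it rarely helps to run more streams in a group than the topic has partitions.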

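If there is another consumer group, the same process is applied independently to the consumers within that group. Each group is assigned all partitions of the topic and tracks its own position, so every subscribing group receives every message; the one-consumer-per-partition rule applies only within a single group.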
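To see why this amounts to a broadcast across groups, here is a toy in-memory model. It is only a sketch and does not use any Kafka client API: the class `ToyTopic`, its methods, and the group and consumer names (`billing`, `audit`, `B1`, `B2`, `A1`) are all invented for illustration.

```python
from typing import Dict, List, Tuple


class ToyTopic:
    """In-memory stand-in for a topic: one message list per partition."""

    def __init__(self, num_partitions: int) -> None:
        self.log: List[List[str]] = [[] for _ in range(num_partitions)]
        self.offsets: Dict[str, List[int]] = {}  # per group, per partition
        self.owners: Dict[str, List[str]] = {}   # per group, per partition

    def subscribe(self, group: str, consumers: List[str]) -> None:
        n = len(self.log)
        # Each group starts with its own read position for every partition.
        self.offsets[group] = [0] * n
        # Within a group, each partition has exactly one owning consumer.
        self.owners[group] = [consumers[p % len(consumers)] for p in range(n)]

    def publish(self, partition: int, message: str) -> None:
        self.log[partition].append(message)

    def poll(self, group: str) -> List[Tuple[str, str]]:
        """Return (consumer, message) pairs this group has not read yet."""
        out: List[Tuple[str, str]] = []
        for p, messages in enumerate(self.log):
            start = self.offsets[group][p]
            out += [(self.owners[group][p], m) for m in messages[start:]]
            self.offsets[group][p] = len(messages)
        return out


topic = ToyTopic(num_partitions=4)
topic.subscribe("billing", ["B1", "B2"])
topic.subscribe("audit", ["A1"])
topic.publish(1, "order-42")

billing_first = topic.poll("billing")   # one consumer in "billing" gets it
audit_first = topic.poll("audit")       # "audit" gets its own copy
billing_second = topic.poll("billing")  # nothing new for "billing"
print(billing_first, audit_first, billing_second)
```

Reading in one group does not advance the position of another group, so the message published to partition 1 is delivered once to each group, and to only one consumer inside each group.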
