How does co-partitioning ensure that partitions from 2 different topics end up assigned to the same Kafka Streams task?

Question

While I understand the prerequisite for co-partitioning, as explained in "Why does co-partitioning of two KStreams in Kafka require the same number of partitions for both streams?", I do not understand the mechanism that makes sure that the partitions of each topic that hold the same key get assigned to the same Kafka Streams task. I do not see how Kafka's consumer group would enable that.

The way I understand it, we have 2 independent consumer groups, which may actually have the same name because they belong to the same Kafka Streams application, although the subscription to each topic is independent of the other.

Somehow, the consumers in each consumer group get assigned the partitions that contain the same keys. I did not know that the assignment of partitions to consumers could be related to the content of the partitions. So far I thought it was random.

Can someone explain that part?

Recommended answer

"The way I understand it, we have 2 independent consumer groups, which may actually have the same name because they belong to the same Kafka Streams application, although the subscription to each topic is independent of the other."

All members of a consumer group have the same "name" (i.e., group.id); it is not possible to have two consumer groups with the same name. It would be one consumer group.

"although the subscription to each topic is independent of the other"

For KafkaConsumer, it is possible to have different subscriptions for different members of the group (even if this should be a very rare scenario). For Kafka Streams, however, it is required that all members of the group (i.e., application instances) execute the exact same Topology with the exact same input topics (i.e., their subscriptions must be the same).

"I did not know that the assignment of partitions to consumers could be related to the content of the partitions. So far I thought it was random."

That is correct.

From your own answer:

"In other words, if the number of partitions is the same and the partitioning strategy of each topic's producers is the same, messages with the same key will be mapped onto the partition range in the same way, and that range is assigned to consumers in the same way, i.e., as a consecutive subset of partitions from each topic. Hence, the same stream task will always have the partitions of both topics that hold the same key."

That is also correct.
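The first half of that statement can be illustrated with a small model. This is only a sketch under stated assumptions: the topic names `orders` and `payments`, the sample keys, and the byte-sum hash are made up for illustration (Kafka's default partitioner uses a murmur2 hash, not this toy function). It shows that the same partitioner with the same partition count sends a key to the same partition number in both topics, and that this guarantee is lost as soon as the partition counts differ.

```python
def partition_for(key: str, num_partitions: int) -> int:
    """Toy stand-in for a producer partitioner: hash(key) mod partition count.

    The byte sum is used only because it is deterministic and easy to check
    by hand; it is NOT the hash Kafka actually uses.
    """
    return sum(key.encode("utf-8")) % num_partitions


keys = ["alice", "bob", "carol"]  # hypothetical record keys

# Both (hypothetical) topics have 4 partitions and share the partitioner.
orders = {k: partition_for(k, 4) for k in keys}
payments = {k: partition_for(k, 4) for k in keys}

# Every key lands on the same partition number in both topics.
co_partitioned = all(orders[k] == payments[k] for k in keys)

# Counter-example: the second topic has 6 partitions instead of 4.
payments_6 = {k: partition_for(k, 6) for k in keys}
mismatched = [k for k in keys if orders[k] != payments_6[k]]

print("co-partitioned with 4/4 partitions:", co_partitioned)
print("keys split across partition numbers with 4/6:", mismatched)
```

Because the partition number depends only on the key and the partition count, nothing in the assignment has to inspect the content of a partition: handing partition p of both topics to the same task is enough.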

Note that Kafka Streams uses a special partition assignor (not one of the default assignors the consumer offers) to ensure co-partitioning and stickiness (i.e., state-store awareness), and to assign standby tasks.
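As a rough mental model of what such an assignor has to guarantee, the sketch below groups partition p of every co-partitioned input topic into one task and then hands out whole tasks to application instances. It is a simplification under stated assumptions, not Kafka Streams' actual algorithm: the topic and instance names are hypothetical, the round-robin distribution is a stand-in, and stickiness and standby tasks are ignored. The task IDs follow the `<sub-topology>_<partition>` naming that Kafka Streams uses.

```python
def build_tasks(topics, num_partitions, subtopology_id=0):
    """Create one task per partition number.

    Task p owns partition p of every co-partitioned input topic, so records
    with the same key from both topics are processed by the same task.
    """
    return {
        f"{subtopology_id}_{p}": [(topic, p) for topic in topics]
        for p in range(num_partitions)
    }


def assign_round_robin(task_ids, instances):
    """Distribute whole tasks (never single partitions) over instances.

    Round-robin is an illustrative simplification of the real assignor.
    """
    assignment = {instance: [] for instance in instances}
    for n, task_id in enumerate(sorted(task_ids)):
        assignment[instances[n % len(instances)]].append(task_id)
    return assignment


# Hypothetical topics and instances.
tasks = build_tasks(["orders", "payments"], num_partitions=4)
assignment = assign_round_robin(tasks, ["instance-A", "instance-B"])

for instance, task_ids in assignment.items():
    for task_id in task_ids:
        print(instance, task_id, tasks[task_id])
```

The point of the model is that the unit of assignment is the task rather than the individual topic partition, which is why the two topics cannot drift apart no matter how tasks are spread over instances.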
