How does co-partitioning ensure that partitions from 2 different topics end up assigned to the same Kafka Streams task?


Problem description

While I understand the prerequisite for co-partitioning, as explained in "Why does co-partitioning of two Kstreams in kafka require same number of partitions for both the streams?", I do not understand the mechanism that makes sure that the partitions of each topic that hold the same keys get assigned to the same Kafka Streams task. I do not see how Kafka's consumer group would enable that.

The way I understand it, we have 2 independent consumer groups, which may actually have the same name because they belong to the same Kafka Streams application, although the subscription to each topic is independent of the other.

Somehow, the consumers in each consumer group get assigned to the partitions that contain the same keys. I did not know that consumer assignment to partitions could be related to the content of the partitions. So far I thought it was random.

Can someone explain that part?

Recommended answer

"The way I understand it, we have 2 independent consumer groups, which may actually have the same name because they belong to the same Kafka Streams application, although the subscription to each topic is independent of the other."

All members of a consumer group have the same "name" (i.e., group.id), so it is not possible to have two consumer groups with the same name. It would be one consumer group.

"although the subscription to each topic is independent of the other"

For KafkaConsumer, it is possible for different members of the group to have different subscriptions (even if this should be a very rare scenario). For Kafka Streams, however, all members of the group (i.e., application instances) are required to execute the exact same Topology with the exact same input topics (i.e., their subscriptions must be the same).
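As a small illustration of the "one group" point: Kafka Streams uses the application.id setting as the consumer group.id, so every instance started with the same application.id joins the same group. The sketch below only builds the configuration with java.util.Properties; the class name, the application name "join-app", and the broker address are made up for the example.

```java
import java.util.Properties;

class StreamsConfigSketch {

    // Builds the minimal settings that every instance of the application shares.
    // Kafka Streams uses "application.id" as the consumer group.id.
    static Properties configFor(String applicationId, String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("application.id", applicationId);
        props.setProperty("bootstrap.servers", bootstrapServers);
        return props;
    }

    public static void main(String[] args) {
        // Two instances of the same (hypothetical) application: same application.id,
        // hence one consumer group, not two groups that happen to share a name.
        Properties instanceA = configFor("join-app", "localhost:9092");
        Properties instanceB = configFor("join-app", "localhost:9092");
        boolean sameGroup = instanceA.getProperty("application.id")
                .equals(instanceB.getProperty("application.id"));
        System.out.println("Same consumer group: " + sameGroup);
    }
}
```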

"I did not know that consumer assignment to partitions could be related to the content of the partitions. So far I thought it was random."

That is correct.

From your own answer:

"In other words, if the number of partitions is the same, and the partitioning strategy of each producer of the topics is the same, messages with the same key will be assigned in the same way across the partition range, which is assigned to the consumers in the same way, i.e. as consecutive subsets of partitions from each topic. Hence, the same stream task will always have the partitions of both topics that hold the same keys."

That is also correct.
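The producer-side half of that statement can be sketched as follows. This is a rough model, not Kafka's code: the real default partitioner hashes the serialized key with murmur2, while the sketch uses Arrays.hashCode as a stand-in to stay dependency-free. The class name and the topic names "orders" and "payments" are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class CoPartitioningSketch {

    // Stand-in for the producer's partitioner: hash the serialized key and take
    // it modulo the partition count. The result depends only on the key bytes
    // and the number of partitions, never on the topic itself.
    static int partitionFor(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        return Math.floorMod(Arrays.hashCode(keyBytes), numPartitions);
    }

    public static void main(String[] args) {
        String[] keys = {"alice", "bob", "carol"};
        for (String key : keys) {
            int inOrders = partitionFor(key, 4);    // hypothetical topic "orders", 4 partitions
            int inPayments = partitionFor(key, 4);  // hypothetical topic "payments", 4 partitions
            int inOtherTopic = partitionFor(key, 6); // a topic with 6 partitions may disagree
            System.out.printf("%s -> orders-%d, payments-%d, 6-partition topic: %d%n",
                    key, inOrders, inPayments, inOtherTopic);
        }
    }
}
```

With the same partitioner and the same partition count, a key always lands on the same partition number in both topics. With a different partition count, that guarantee is gone, which is why co-partitioning requires equal counts.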

Note that Kafka Streams uses a special partition assignor (not one of the default ones the consumer offers) to ensure co-partitioning, stickiness (i.e., state-store awareness), and to assign standby tasks.
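A simplified model of what such an assignor does for co-partitioning might look like the sketch below. It is not the actual Kafka Streams implementation: partitions with the same number are bundled into one task, and tasks are then handed to instances round-robin, ignoring stickiness and standby tasks. All class, method, topic, and instance names are invented for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class TaskGroupingSketch {

    // Task p gets partition p of every co-partitioned input topic.
    static Map<Integer, List<String>> groupIntoTasks(List<String> topics, int numPartitions) {
        Map<Integer, List<String>> tasks = new TreeMap<>();
        for (int p = 0; p < numPartitions; p++) {
            List<String> topicPartitions = new ArrayList<>();
            for (String topic : topics) {
                topicPartitions.add(topic + "-" + p);
            }
            tasks.put(p, topicPartitions);
        }
        return tasks;
    }

    // Assumption: plain round-robin. The real assignor also weighs existing
    // state stores (stickiness) and places standby tasks.
    static Map<String, List<Integer>> assignTasks(Map<Integer, List<String>> tasks,
                                                  List<String> instances) {
        Map<String, List<Integer>> assignment = new TreeMap<>();
        for (String instance : instances) {
            assignment.put(instance, new ArrayList<>());
        }
        int i = 0;
        for (Integer taskId : tasks.keySet()) {
            assignment.get(instances.get(i % instances.size())).add(taskId);
            i++;
        }
        return assignment;
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> tasks =
                groupIntoTasks(Arrays.asList("orders", "payments"), 4);
        System.out.println("Tasks: " + tasks);
        System.out.println("Assignment: "
                + assignTasks(tasks, Arrays.asList("instance-a", "instance-b")));
    }
}
```

The unit of assignment here is the task, not the individual topic partition, so whichever instance receives a task receives the matching partitions of both topics together. The content of the partitions is never inspected.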
