What is the difference between Kafka partitions and Kafka replicas?

Question

I set up three Kafka brokers with broker IDs 20, 21, and 22. Then I created this topic:

bin/kafka-topics.sh --zookeeper localhost:2181 \
  --create --topic zeta --partitions 4 --replication-factor 3

This resulted in the partition layout shown in a screenshot (not reproduced here).

When a producer sends the message "hello world" to topic zeta, which partition does Kafka write the message to first?

Does the "hello world" message get replicated to all 4 partitions?

Does each of the 3 brokers contain all 4 partitions? How does that relate to the replication factor of 3 in this context?

If I have 8 consumers running in parallel in their own processes or threads, all subscribed to topic zeta, how does Kafka assign partitions or brokers to serve them in parallel?

Recommended answer

Replication and partitioning are two different things.

Replication copies identical data across the cluster for higher availability and durability. Partitioning is Kafka's way of distributing non-redundant data across the cluster, and throughput scales with the number of partitions.

When a producer sends the message "hello world" to topic zeta, which partition does Kafka write the message to first?

When you send a "hello world" message to a topic, by default your producer applies a hashing algorithm to the key of that message (like hash(key) % number_of_partitions). If you did not provide a key, older producer clients distribute messages round-robin, and newer clients (since Kafka 2.4) use a sticky partitioner that fills a batch for one partition before moving on to another. Either way, it is not predictable which partition the message will be sent to. In particular, there is no guarantee that the first message ends up in partition 0.
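The keyed case can be sketched as follows. This is a minimal illustration and not Kafka's actual partitioner: the Java client hashes the serialized key with murmur2, whereas this sketch substitutes `zlib.crc32` as a stand-in stable hash, and `choose_partition` is a hypothetical helper name.

```python
import zlib

NUM_PARTITIONS = 4  # topic zeta was created with --partitions 4


def choose_partition(key: bytes, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a message key to a partition: hash(key) % number_of_partitions.

    Assumption: zlib.crc32 stands in for the murmur2 hash used by the real
    Java client, so the partition numbers here will not match a real cluster.
    """
    return zlib.crc32(key) % num_partitions


if __name__ == "__main__":
    # The same key is always routed to the same partition.
    for key in (b"user-1", b"user-2", b"user-1"):
        print(key.decode(), "->", choose_partition(key))
```

Because the mapping depends only on the key and the partition count, all messages with the same key land in the same partition, which is what preserves per-key ordering. Changing the number of partitions changes the mapping.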

Does the "hello world" message get replicated to all 4 partitions?

This one message is replicated to every replica of the partition it was written to, but it is not copied into the other partitions.

You will find the message on brokers 20, 21, and 22. However, each partition has a leader that, by default, handles all reads from and writes to that partition. In your screenshot you can also spot the broker ID of each partition's leader. For example, Leader: 21 for partition 0 tells you that the leader of that partition sits on broker 21.

Does each of the 3 brokers contain all 4 partitions? How does that relate to the replication factor of 3 in this context?

Because you set the replication factor to 3 and have 3 brokers in total in your cluster, all three brokers hold a replica of each of the four partitions. Again, there is a difference between partitions and replicas. You could have a Kafka "cluster" with a single broker and still have, say, 20 partitions in the topic.
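To make the relationship between partition count and replication factor concrete, here is a small sketch of replica placement. It is a simplified assumption, not Kafka's real algorithm: the broker randomizes the starting position and can take racks into account, and leader election is not modeled. `assign_replicas` is a hypothetical helper name.

```python
def assign_replicas(broker_ids, num_partitions, replication_factor):
    """Place each partition's replicas on consecutive brokers, round-robin.

    Simplified assumption: Kafka's real assignment also randomizes the
    starting broker and can be rack-aware; this sketch ignores both.
    """
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor cannot exceed the number of brokers")
    n = len(broker_ids)
    return {
        p: [broker_ids[(p + r) % n] for r in range(replication_factor)]
        for p in range(num_partitions)
    }


if __name__ == "__main__":
    # The setup from the question: brokers 20, 21, 22; 4 partitions; factor 3.
    layout = assign_replicas([20, 21, 22], num_partitions=4, replication_factor=3)
    for partition, replicas in layout.items():
        print(f"partition {partition}: replicas on brokers {replicas}")
```

With 3 brokers and a replication factor of 3, every partition's replica list contains all three brokers, so each broker stores all 4 partitions. Calling `assign_replicas([20], 20, 1)` models the single-broker case: 20 partitions, each with exactly one copy.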

If I have 8 consumers running in parallel in their own processes or threads, all subscribed to topic zeta, how does Kafka assign partitions or brokers to serve them in parallel?

That depends on whether those 8 consumers belong to the same consumer group. It is important to know that a partition can be read by at most one consumer thread from a particular consumer group.

If all 8 consumers belong to the same group, 4 of them will each read from one partition (only from the partition leader) and the other 4 will be idle.
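The "one partition, at most one consumer per group" rule can be sketched like this. It is a simplified illustration under the assumption that partitions are dealt out round-robin within a single group and that the group has at least one member; Kafka's configurable assignors differ in detail, and `assign_partitions` is a hypothetical helper name.

```python
def assign_partitions(partitions, consumers):
    """Give each partition to exactly one consumer of a single group.

    Simplified assumption: partitions are dealt out round-robin; Kafka's
    configurable assignors (range, round-robin, sticky) differ in detail.
    """
    assignment = {consumer: [] for consumer in consumers}
    for i, partition in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment


if __name__ == "__main__":
    # 8 consumers in one group, 4 partitions in topic zeta.
    group = [f"consumer-{i}" for i in range(8)]
    for consumer, owned in assign_partitions([0, 1, 2, 3], group).items():
        print(consumer, "->", owned if owned else "idle")
```

With 8 consumers and 4 partitions, four consumers own one partition each and four own none. If the 8 consumers instead belonged to 8 different groups, each group would be assigned all 4 partitions independently.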
