Topics, partitions and keys

Question

I am looking for some clarification on the subject. In the Kafka documentation I found the following:

Kafka only provides a total order over messages within a partition, not between different partitions in a topic. Per-partition ordering combined with the ability to partition data by key is sufficient for most applications. However, if you require a total order over messages this can be achieved with a topic that has only one partition, though this will mean only one consumer process per consumer group.

Here are my questions:

  1. Does it mean that if I want more than one consumer (from the same group) reading from one topic, I need more than one partition?
  2. Does it mean I need the same number of partitions as consumers in the same group?
  3. How many consumers can read from one partition?

I also have some questions about the relationship between keys and partitions in the API. I have only looked at the .NET APIs (especially the one from MS), but they look like they mimic the Java API. I see that when a producer sends a message to a topic there is a key parameter, but when a consumer reads from a topic there is a partition number.

  1. How are partitions numbered? Starting from 0 or 1?
  2. What exactly is the relationship between a key and a partition? As I understand it, some function of the key determines the partition. Is that correct?
  3. If I have 2 partitions in a topic and want some particular messages to go to one partition and the other messages to go to the other, should I use a specific key for one partition and the remaining keys for the other?
  4. What if I have 3 partitions and want one type of message to go to one particular partition and the rest to the other 2?
  5. How, in general, do I send messages to a particular partition so that a consumer knows where to read from? Or am I better off with multiple topics?

Thanks.

Recommended answer

Igor,

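Partitions increase the parallelism of a Kafka topic. Any number of producers can write to the same partition, and consumers from any number of consumer groups can read from it; within a single consumer group, though, each partition is read by at most one consumer at a time, so consumers beyond the partition count sit idle. It's up to the application layer to define the protocol. Kafka guarantees delivery. Regarding the API, you may want to look at the Java docs, as they may be more complete. Based on my experience: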

  1. Partitions are numbered starting from 0.
  2. Keys may be used to send messages to the same partition, for example via hash(key) % num_partition. The partitioning logic is pluggable in the producer: https://kafka.apache.org/090/javadoc/index.html?org/apache/kafka/clients/producer/Partitioner.html
  3. Yes, but be careful not to end up with a key that effectively creates a "dedicated" partition. For that, you may want a dedicated topic instead, for example a control topic and a data topic.
  4. This seems to be the same question as 3.
  5. I believe consumers should not make assumptions about the data based on its partition. The typical approach is to have a consumer group that reads from multiple partitions of a topic. If you want dedicated channels, it is better (safer and more maintainable) to use separate topics.
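
To make points 2 and 5 a bit more concrete, here is a rough sketch in plain Python (no Kafka client involved). The function names `partition_for_key` and `assign_partitions` are made up for illustration, `zlib.crc32` is only a stand-in hash (as far as I know the Java client's default partitioner uses murmur2, so real partition numbers will differ), and the round-robin assignment is a simplification of what a real group coordinator does.

```python
import zlib


def partition_for_key(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition index in [0, num_partitions)."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # Stand-in hash (assumption): any deterministic hash gives the same
    # property, namely that equal keys always land in the same partition.
    return zlib.crc32(key) % num_partitions


def assign_partitions(consumers: list, num_partitions: int) -> dict:
    """Spread partitions over the consumers of ONE group, round-robin.

    Each partition goes to exactly one consumer; a consumer may end up
    with several partitions, or with none if the group is too large.
    """
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):  # partitions are numbered from 0
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


if __name__ == "__main__":
    # The same key is always routed to the same partition.
    for key in [b"user-1", b"user-2", b"user-1"]:
        print(key, "->", partition_for_key(key, 3))

    # Three consumers in one group but only two partitions: "c3" gets nothing.
    print(assign_partitions(["c1", "c2", "c3"], 2))
```

The second call is the case from the question: with more consumers in a group than partitions, the extra consumer receives an empty assignment, which is why adding consumers beyond the partition count does not add parallelism.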
