Kafka topic per producer

Question

Let's say I have multiple devices, and each device has different types of sensors. I want to send the data from each sensor on each device to Kafka, but I am confused about how to organize the Kafka topics for processing this real-time data.

Is it better to have a Kafka topic per device, with all the sensors on that device sending data to that particular topic, or should I create one topic and have all the devices send data to it?

If I go with the first case, where we create a topic per device, then:

Device1 (sensors A, B, C) -> topic1

Device2 (sensors A, B, C) -> topic2

  1. How many topics can I create?
  2. Will this model scale?

Case 2: sending data to one topic:

Device1 (sensors A, B, C), Device2 (sensors A, B, C), ..., DeviceN -> topic

  1. Isn't this going to be a bottleneck for the data? Since the topic behaves like a queue, data from some sensors will fall far behind in the queue and will not be processed in real time.

  2. Will this model scale?

Edit

Let's say each device is associated with a user (many devices to one user), so I want to process data per device: each device's sensor data will, after some processing, be delivered to that user.

Say I have the following mapping:

Device1

-> Sensor A - Topic1 Partition 1

-> Sensor B - Topic1 Partition 2

Device2

-> Sensor A - Topic2 Partition 1

-> Sensor B - Topic2 Partition 2

I want some pub/sub type of behavior. Since devices can be added or removed, and sensors can be added or removed as well, is there a way to create these topics and partitions on the fly?

If not Kafka, what pub/sub system would be suitable for this kind of behavior?

Answer

It depends on your semantics:

  • A topic is a logical abstraction and should contain "uniform" data, i.e., data with the same semantic meaning
  • A topic can be scaled out easily via its number of partitions

For example, if you have different types of sensors collecting different data, you should use a topic for each type.

"Since devices can be added or removed, and sensors can be added or removed as well, is there a way to create these topics and partitions on the fly?"

If device metadata (to distinguish which device the data comes from) is embedded in each message, you should use a single topic with many partitions to scale out. Adding new topics or partitions is possible, but must be done manually. A problem with adding new partitions is that it can change your data distribution and thus break semantics, so the best practice is to over-partition the topic from the beginning and avoid adding partitions later.
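
As an illustration of this single-topic approach, here is a minimal producer sketch. The topic name sensor-data, the broker address, and the JSON field names are assumptions for illustration, not anything the answer prescribes. The device ID doubles as the record key, so Kafka's default partitioner keeps each device's data in one partition, in order:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class SensorProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
            props.put("key.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // Device/sensor metadata is embedded in the message itself,
                // so one topic can carry the data from all devices.
                String deviceId = "device1"; // hypothetical IDs and payload layout
                String payload = "{\"deviceId\":\"" + deviceId
                        + "\",\"sensorId\":\"A\",\"value\":23.5}";

                // Keying by deviceId sends all records of one device to the
                // same partition, preserving per-device ordering.
                producer.send(new ProducerRecord<>("sensor-data", deviceId, payload));
            }
        }
    }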

If there is no embedded metadata, you would need multiple topics (e.g., per user or per device) to distinguish message origins.
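
If you do need topics per user or per device, they need not all exist up front: Kafka does not create them for you (unless broker-side auto-creation is enabled), but an application can create them explicitly with the Java AdminClient, e.g., when a new device registers. A minimal sketch, with a hypothetical naming scheme and assumed partition/replication counts:

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.NewTopic;

    public class DeviceTopicCreator {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // assumed broker address

            try (AdminClient admin = AdminClient.create(props)) {
                // Hypothetical scheme: one topic per device, created on registration.
                NewTopic topic = new NewTopic("device1-data",
                        4,          // partitions, e.g. one per sensor (assumed)
                        (short) 1); // replication factor (assumed)

                // createTopics() is asynchronous; all() returns a future to wait on.
                admin.createTopics(Collections.singleton(topic)).all().get();
            }
        }
    }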

As an alternative, a single topic with multiple partitions and a fixed mapping from device/sensor to partition, via a custom partitioner, might also work. In this case, adding new partitions is no problem, since you control the data distribution and can keep it stable.

Update

There is a blog post discussing this: https://www.confluent.io/blog/put-several-event-types-kafka-topic/
