Kafka topic per producer


Question

Let's say I have multiple devices, and each device has different types of sensors. Now I want to send the data from each sensor of each device to Kafka, but I am confused about the Kafka topics: how should they be structured for processing this real-time data?

Is it good to have a Kafka topic per device, with all the sensors from that device sending their data to that particular topic, or should I create one topic and have all the devices send their data to that one topic?

If I go with the first case, where we create a topic per device, then:

Device1 (sensor A, B, C) -> topic1

Device2 (sensor A, B, C) -> topic2

  1. How many topics can I create?
  2. Will this model scale?

Case 2: sending data to one topic

Device1 (sensor A, B, C), Device2 (sensor A, B, C), ..., DeviceN -> topic

  1. Isn't this going to be a bottleneck for the data? Since the topic behaves like a queue, data from some sensors may end up far back in the queue and not be processed in real time.

  2. Will this model scale?

EDIT

Let's say each device is associated with a user (many devices to one user), and I want to process data per device: each device's sensor data will go to that user after some processing.

Say I go with the following:

Device1

-> Sensor A - Topic1 Partition 1

-> Sensor B - Topic1 Partition 2

Device2

-> Sensor A - Topic2 Partition 1

-> Sensor B - Topic2 Partition 2

I want some pub/sub type of behavior, since devices can be added or removed, and sensors can be added or removed too. Is there a way to create these topics and partitions on the fly?

If not Kafka, what pub/sub system would be suitable for this kind of behavior?

Answer

This depends on your semantics:

  • A topic is a logical abstraction and should contain "uniform" data, i.e., data with the same semantic meaning
  • A single topic can easily be scaled out via its number of partitions

For example, if you have different types of sensors collecting different data, you should use a topic for each type.

Since devices can be added or removed, and sensors can be added or removed too. Is there a way to create these topics and partitions on the fly?

If device metadata (to distinguish where the data comes from) is embedded in each message, you should use a single topic with many partitions to scale out. Adding new topics or partitions is possible, but must be done manually. For adding new partitions, a problem might be that doing so changes your data distribution and thus might break semantics. Therefore, the best practice is to over-partition your topic from the beginning to avoid having to add new partitions later.
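
As a minimal sketch of that approach (the topic name sensor-data, the partition count, and the JSON payload are all assumptions for illustration), a Java producer could create an over-partitioned topic up front via the AdminClient and key every record by device ID, so that the default partitioner keeps all data of one device in the same partition:

    import java.util.List;
    import java.util.Properties;

    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.NewTopic;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class SensorDataProducer {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");

            // Over-partition from the start (50 partitions; replication
            // factor 1 for a single-broker test) so that partitions never
            // have to be added later.
            try (AdminClient admin = AdminClient.create(props)) {
                admin.createTopics(List.of(new NewTopic("sensor-data", 50, (short) 1)))
                     .all().get();
            }

            props.put("key.serializer", StringSerializer.class.getName());
            props.put("value.serializer", StringSerializer.class.getName());

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // Device/sensor metadata travels inside the message; the device
                // ID is the record key, so the default partitioner (which hashes
                // the key) sends all records of one device to the same partition.
                String deviceId = "device1";
                String payload = "{\"device\":\"device1\",\"sensor\":\"A\",\"value\":42}";
                producer.send(new ProducerRecord<>("sensor-data", deviceId, payload)).get();
            }
        }
    }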

If there is no embedded metadata, you would need multiple topics (e.g., per user or per device) to distinguish message origins.
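
Continuing the sketch above, the only change in this variant is that the topic name, rather than the message itself, encodes the origin (the device-<id> naming scheme is a made-up convention):

    // Without embedded metadata, the topic name carries the origin:
    // one topic per device, e.g. "device-device1" (hypothetical scheme).
    String topic = "device-" + deviceId;
    producer.send(new ProducerRecord<>(topic, payload));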

As an alternative, maybe a single topic with multiple partitions and a fixed mapping from device/sensor to partition (via a custom partitioner) would work, too. In this case, adding new partitions is no problem, because you control the data distribution and can keep it stable.
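
A minimal sketch of such a partitioner (the "deviceId:sensorId" key format and the fixed lookup table are assumptions, not anything Kafka prescribes):

    import java.util.Map;

    import org.apache.kafka.clients.producer.Partitioner;
    import org.apache.kafka.common.Cluster;

    public class DeviceSensorPartitioner implements Partitioner {

        // Hypothetical fixed mapping from a "deviceId:sensorId" key to a
        // partition. Because the mapping is explicit, adding partitions to
        // the topic later does not reshuffle existing device/sensor data.
        private static final Map<String, Integer> FIXED_MAPPING = Map.of(
                "device1:A", 0,
                "device1:B", 1,
                "device2:A", 2,
                "device2:B", 3);

        @Override
        public int partition(String topic, Object key, byte[] keyBytes,
                             Object value, byte[] valueBytes, Cluster cluster) {
            // Assumes a non-null String key of the form "deviceId:sensorId".
            Integer partition = FIXED_MAPPING.get(key);
            if (partition != null) {
                return partition;
            }
            // Fallback for unmapped pairs: a stable hash over all partitions.
            int numPartitions = cluster.partitionsForTopic(topic).size();
            return Math.floorMod(key.hashCode(), numPartitions);
        }

        @Override
        public void close() {}

        @Override
        public void configure(Map<String, ?> configs) {}
    }

The producer would then select it via the partitioner.class setting, e.g. props.put(ProducerConfig.PARTITIONER_CLASS_CONFIG, DeviceSensorPartitioner.class.getName()).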

Update

There is a blog post discussing this: https://www.confluent.io/blog/put-several-event-types-kafka-topic/
