Apache Flink - Partitioning the stream equally as the input Kafka topic


Problem Description

I would like to implement the following scenario in Apache Flink:

Given a Kafka topic with 4 partitions, I would like to process the intra-partition data independently in Flink, applying different logic depending on the event's type.

In particular, suppose the input Kafka topic contains the events depicted in the previous images. Each event has a different structure: partition 1 has the field "a" as its key, partition 2 has the field "b" as its key, and so on. In Flink I would like to apply different business logic depending on the event, so I thought I should split the stream in some way. To achieve what's described in the picture, I thought of doing something like this using just one consumer (I don't see why I should use more):

FlinkKafkaConsumer<..> consumer = ...
DataStream<..> stream = flinkEnv.addSource(consumer);

stream.keyBy("a").map(new AEventMapper()).addSink(...);
stream.keyBy("b").map(new BEventMapper()).addSink(...);
stream.keyBy("c").map(new CEventMapper()).addSink(...);
stream.keyBy("d").map(new DEventMapper()).addSink(...);

(a) Is this correct? Also, since I'm only interested in processing events in order within the same Kafka partition, and not globally, (b) how can I process each Flink partition in parallel? I know the method setParallelism() exists, but I don't know where to apply it in this scenario.

I'm looking for an answer to the questions marked (a) and (b). Thank you in advance.

Recommended Answer

If you can build it like this, it will perform better:

Specifically, what I'm proposing is:

  1. Set the parallelism of the entire job to exactly match the number of Kafka partitions. Then each FlinkKafkaConsumer instance will read from exactly one partition.

  2. If possible, avoid using keyBy, and avoid changing the parallelism. Then the source, map, and sink will all be chained together (this is called operator chaining), and no serialization/deserialization and no networking will be needed (within Flink). Not only will this perform well, but you can also take advantage of fine-grained recovery (streaming jobs that are embarrassingly parallel can recover one failed task without interrupting the others).

  3. You can write a general-purpose EventMapper that checks what type of event is being processed, and then does whatever is appropriate. Or you can try to be clever and implement a RichMapFunction that, in its open(), figures out which partition is being handled, and loads the appropriate mapper (a sketch illustrating these three points follows after this list).
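
As a rough illustration of points 1-3, here is a minimal sketch in Java. The class name, topic name, Kafka properties, and the string-matching inside EventMapper are hypothetical placeholders; in a real job the events would be deserialized into a proper POJO and dispatched on a type field:

import java.util.Properties;

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class PerPartitionJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Point 1: 4 Kafka partitions -> job parallelism 4, so each
        // FlinkKafkaConsumer subtask reads from exactly one partition.
        env.setParallelism(4);

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // hypothetical broker
        props.setProperty("group.id", "event-processor");         // hypothetical group

        FlinkKafkaConsumer<String> consumer =
                new FlinkKafkaConsumer<>("events", new SimpleStringSchema(), props);

        // Point 2: no keyBy and no parallelism change between operators,
        // so source, map, and sink are chained into one task per partition
        // (operator chaining), with no serialization or network hops in Flink.
        env.addSource(consumer)
                .map(new EventMapper())
                .print(); // stand-in sink; replace with a real sink

        env.execute("per-partition event processing");
    }

    // Point 3: a general-purpose mapper that checks each event's type
    // and applies the matching business logic.
    public static class EventMapper implements MapFunction<String, String> {
        @Override
        public String map(String event) {
            // Crude type check as a placeholder for real parsing/dispatch.
            if (event.contains("\"a\"")) {
                return "A-logic applied: " + event;
            } else if (event.contains("\"b\"")) {
                return "B-logic applied: " + event;
            } else if (event.contains("\"c\"")) {
                return "C-logic applied: " + event;
            } else {
                return "D-logic applied: " + event;
            }
        }
    }
}

Because every operator runs at the same parallelism and no keys are reassigned, each of the four chains processes one partition's events in order, which addresses question (b): a single env.setParallelism(4) (or -p 4 at submission time) is enough, with no need to call setParallelism() on individual operators.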
