Apache Flink - Partitioning the stream equally as the input Kafka topic


Problem Description

I would like to implement the following scenario in Apache Flink:

Given a Kafka topic having 4 partitions, I would like to process the intra-partition data independently in Flink, using different logic depending on the event's type.

In particular, suppose the input Kafka topic contains the events depicted in the previous images. Each event has a different structure: partition 1 has the field "a" as key, partition 2 has the field "b" as key, and so on. In Flink I would like to apply different business logic depending on the event, so I thought I should split the stream in some way. To achieve what's described in the picture, I thought of doing something like the following, using just one consumer (I don't see why I should use more):

FlinkKafkaConsumer<..> consumer = ...
DataStream<..> stream = flinkEnv.addSource(consumer);

stream.keyBy("a").map(new AEventMapper()).addSink(...);
stream.keyBy("b").map(new BEventMapper()).addSink(...);
stream.keyBy("c").map(new CEventMapper()).addSink(...);
stream.keyBy("d").map(new DEventMapper()).addSink(...);

(a) Is this correct? Also, if I would like to process each Flink partition in parallel, since I'm only interested in processing in order the events within the same Kafka partition, and not in considering them globally, (b) how can I do that? I know the method setParallelism() exists, but I don't know where to apply it in this scenario.

I'm looking for an answer to the questions marked (a) and (b). Thank you in advance.

Answer

If you can build it like this, it will perform better. Specifically, my recommendation is to:

1. Set the parallelism of the entire job to exactly match the number of Kafka partitions. Then each FlinkKafkaConsumer instance will read from exactly one partition (see the sketch after this list).

2. If possible, avoid using keyBy, and avoid changing the parallelism. Then the source, map, and sink will all be chained together (this is called operator chaining), and no serialization/deserialization and no networking will be needed (within Flink). Not only will this perform well, but you can also take advantage of fine-grained recovery (streaming jobs that are embarrassingly parallel can recover one failed task without interrupting the others).
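
Putting recommendations 1 and 2 together, a minimal sketch of the suggested topology might look like the following. This is not code from the original answer: the topic name, the Event type, and the EventSchema, EventMapper, and EventSink classes are hypothetical placeholders, and 4 is assumed to be the partition count.

import java.util.Properties;

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

// 1. Match the job parallelism to the number of Kafka partitions,
// so each FlinkKafkaConsumer subtask reads exactly one partition.
env.setParallelism(4);

Properties props = new Properties();
props.setProperty("bootstrap.servers", "localhost:9092");  // placeholder broker
props.setProperty("group.id", "flink-consumer");           // placeholder group

FlinkKafkaConsumer<Event> consumer =
        new FlinkKafkaConsumer<>("my-topic", new EventSchema(), props);

// 2. No keyBy and no parallelism change: source, map, and sink are
// chained into one task per partition, with no shuffle in between.
env.addSource(consumer)
   .map(new EventMapper())    // a general-purpose mapper, see below
   .addSink(new EventSink());

env.execute();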

You can write a general-purpose EventMapper that checks to see what type of event is being processed, and then does whatever is appropriate. Or you can try to be clever and implement a RichMapFunction that, in its open() method, figures out which partition is being handled, and loads the appropriate mapper.
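
Here is a hedged sketch of that second, "clever" variant. The assumption that subtask i consumes Kafka partition i is only plausible when the parallelism equals the partition count, and even then the exact assignment is a connector implementation detail, so treat the switch below as illustrative. Event and the per-partition mapper classes are the hypothetical names from the question.

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

public class PartitionAwareMapper extends RichMapFunction<Event, Event> {

    private transient MapFunction<Event, Event> delegate;

    @Override
    public void open(Configuration parameters) throws Exception {
        // With parallelism == number of partitions, each subtask handles
        // a single Kafka partition; choose its business logic once here.
        // (Assumes subtask i reads partition i, which is not guaranteed.)
        switch (getRuntimeContext().getIndexOfThisSubtask()) {
            case 0:  delegate = new AEventMapper(); break;
            case 1:  delegate = new BEventMapper(); break;
            case 2:  delegate = new CEventMapper(); break;
            default: delegate = new DEventMapper(); break;
        }
    }

    @Override
    public Event map(Event event) throws Exception {
        return delegate.map(event);
    }
}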
