Apache Flink add new stream dynamically
Problem description
Is it possible in Apache Flink to add a new datastream dynamically at runtime, without restarting the job?
As far as I understand, a usual Flink program looks like this:
import org.apache.flink.streaming.api.scala._

val env = StreamExecutionEnvironment.getExecutionEnvironment
val text = env.socketTextStream(hostname, port, '\n')
val windowCounts = text.map(...) // further transformations elided
env.execute("Socket Window WordCount")
In my case it is possible that, for example, a new device is started and therefore another stream has to be processed. But how can this new stream be added on the fly?
Recommended answer
It is not possible to add new streams at runtime to a Flink program.
The way to solve this problem is to have a single stream which contains all incoming events (e.g. a Kafka topic into which you ingest all the individual streams). The events should carry a key identifying which stream they come from. This key can then be used to keyBy the stream and to apply per-key processing logic.
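As an illustration (not part of the original answer), here is a minimal Scala sketch of that pattern. The DeviceEvent type, the "deviceId,value" line format, and the socket source are hypothetical stand-ins; the socket merely represents the single ingestion point (e.g. the Kafka topic) described above:

import org.apache.flink.streaming.api.scala._

// Hypothetical event type: each record carries the id of the stream/device it belongs to.
case class DeviceEvent(deviceId: String, value: Double)

object PerDeviceJob {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Stand-in source for the single ingestion point (e.g. a Kafka topic all devices write to);
    // each line is assumed to look like "deviceId,value".
    val allEvents: DataStream[DeviceEvent] = env
      .socketTextStream("localhost", 9999)
      .map { line =>
        val Array(id, v) = line.split(",")
        DeviceEvent(id, v.toDouble)
      }

    // Partition by the stream/device id; a previously unseen device simply shows up
    // as a new key group, no job restart required.
    val perDevice = allEvents
      .keyBy(_.deviceId)
      .reduce((a, b) => DeviceEvent(a.deviceId, a.value + b.value)) // per-key logic, e.g. a running sum

    perDevice.print()
    env.execute("Per-device processing")
  }
}

Because the partitioning happens per key, events from a new device automatically form a new key group while the job keeps running unchanged.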
If you want to read from multiple sockets, you could write your own SourceFunction which reads from some input (e.g. from a fixed socket) the ports to open sockets for. Internally, you could then keep all these sockets open and read from them in a round-robin fashion.
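A rough sketch of such a custom source is shown below; it is an assumption-laden illustration rather than code from the answer. It assumes a hypothetical control socket on which new port numbers are announced one per line; checkpointing, reconnection and error handling are left out:

import java.io.{BufferedReader, InputStreamReader}
import java.net.Socket

import org.apache.flink.streaming.api.functions.source.SourceFunction

import scala.collection.mutable

// Hypothetical source: a control socket announces new ports (one per line); for each
// announced port a data socket is opened, and all open sockets are polled round-robin.
class MultiSocketSource(controlHost: String, controlPort: Int)
  extends SourceFunction[String] {

  @volatile private var running = true

  override def run(ctx: SourceFunction.SourceContext[String]): Unit = {
    val control = new Socket(controlHost, controlPort)
    val controlIn = new BufferedReader(new InputStreamReader(control.getInputStream))
    val dataReaders = mutable.ArrayBuffer.empty[BufferedReader]

    while (running) {
      // Check the control socket for newly announced ports.
      if (controlIn.ready()) {
        val port = controlIn.readLine().trim.toInt
        val dataSocket = new Socket(controlHost, port)
        dataReaders += new BufferedReader(new InputStreamReader(dataSocket.getInputStream))
      }

      // Round-robin over all currently open data sockets.
      for (reader <- dataReaders if reader.ready()) {
        val line = reader.readLine()
        if (line != null) ctx.collect(line)
      }

      Thread.sleep(50) // avoid busy-spinning when no socket has data
    }
  }

  override def cancel(): Unit = { running = false }
}

Such a source could then be attached with env.addSource(new MultiSocketSource("localhost", 9000)), where host and port are placeholders.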