Spark Streaming + Kafka Integration: Support new topic subscriptions without requiring restart of the streaming context


Problem description

I am using a Spark Streaming application (Spark 2.1) to consume data from Kafka (0.10.1) topics. I want to subscribe to new topics without restarting the streaming context. Is there any way to achieve this?

I can see a JIRA ticket in the Apache Spark project for this (https://issues.apache.org/jira/browse/SPARK-10320). Even though it was closed in version 2.0, I couldn't find any documentation or example of how to do it. If any of you are familiar with this, please provide a documentation link or an example. Thanks in advance.

Recommended answer

I found this solution suitable for my purpose. We can share a single 'StreamingContext' instance across different DStreams. For better management, we can create a separate 'dStream' instance for each topic using the same streaming context, and store each 'dStream' instance in a map keyed by its topic name, so that later you can stop or unsubscribe from that particular topic. Please see the code below for clarity.

The code is available as a gist: https://gist.github.com/shemeemsp7/01d21588347b94204c71a14005be8afa
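Since the embedded gist may not render outside the original page, here is a minimal sketch of the pattern the answer describes, assuming Spark 2.1 with the spark-streaming-kafka-0-10 connector. The `TopicStreamManager` object, the `subscribe`/`unsubscribe` method names, the broker address, and the group id are all illustrative, not part of any Spark API; this is a sketch of the idea, not the gist's exact code.

```scala
import org.apache.kafka.clients.consumer.ConsumerRecord
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.dstream.InputDStream
import org.apache.spark.streaming.kafka010._

import scala.collection.mutable

object TopicStreamManager {
  val conf = new SparkConf().setAppName("multi-topic-demo").setMaster("local[2]")
  val ssc  = new StreamingContext(conf, Seconds(5))

  val kafkaParams = Map[String, Object](
    "bootstrap.servers"  -> "localhost:9092",          // assumed broker address
    "key.deserializer"   -> classOf[StringDeserializer],
    "value.deserializer" -> classOf[StringDeserializer],
    "group.id"           -> "demo-group",              // assumed consumer group
    "auto.offset.reset"  -> "latest",
    "enable.auto.commit" -> (false: java.lang.Boolean)
  )

  // Topic name -> its DStream, so an individual topic's stream can be
  // found and stopped later, as the answer suggests.
  val streams = mutable.Map[String, InputDStream[ConsumerRecord[String, String]]]()

  // Create a separate direct stream for this topic on the shared context.
  def subscribe(topic: String): Unit = {
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](Seq(topic), kafkaParams)
    )
    // Each stream needs an output operation to be scheduled.
    stream.foreachRDD(rdd => rdd.foreach(r => println(s"[$topic] ${r.value}")))
    streams += topic -> stream
  }

  // Stop the per-topic stream; stop() is defined on InputDStream.
  def unsubscribe(topic: String): Unit =
    streams.remove(topic).foreach(_.stop())
}
```

After registering the desired topic streams with `subscribe`, you would call `ssc.start()` and `ssc.awaitTermination()` as usual; note that vanilla Spark Streaming does not allow registering new input streams on a context that has already been started, so the per-topic streams must be set up before `start()` and individually stopped later.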

