Spark Streaming + Kafka Integration: Support new topic subscriptions without requiring restart of the streaming context


Question

I am using a Spark Streaming application (Spark 2.1) to consume data from Kafka (0.10.1) topics. I want to subscribe to a new topic without restarting the streaming context. Is there any way to achieve this?

I can see a JIRA ticket for this in the Apache Spark project (https://issues.apache.org/jira/browse/SPARK-10320). Even though it was closed in version 2.0, I couldn't find any documentation or example of how to do this. If any of you are familiar with it, please point me to the relevant documentation or an example. Thanks in advance.

Answer

I found the following approach most suitable for my purpose. A single `StreamingContext` instance can be shared by multiple DStreams. For easier management, create a separate DStream for each topic using the same streaming context, and store each DStream in a map keyed by its topic name; that way you can later stop, or unsubscribe from, a particular topic. See the code below for clarity.

The full example is available as a gist: https://gist.github.com/shemeemsp7/01d21588347b94204c71a14005be8afa
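In case the embedded gist does not render, the approach described above can be sketched as follows. This is a minimal sketch, assuming Spark 2.1 with the `spark-streaming-kafka-0-10` connector; the broker address, group id, and the `TopicStreamManager` name are illustrative and not taken from the original answer.

```scala
import org.apache.kafka.clients.consumer.ConsumerRecord
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.dstream.InputDStream
import org.apache.spark.streaming.kafka010._
import scala.collection.mutable

// Hypothetical helper object: one shared StreamingContext,
// one DStream per topic, kept in a map keyed by topic name.
object TopicStreamManager {

  val conf = new SparkConf().setAppName("multi-topic-demo").setMaster("local[*]")
  val ssc  = new StreamingContext(conf, Seconds(5))

  val kafkaParams = Map[String, Object](
    "bootstrap.servers" -> "localhost:9092",          // assumed broker address
    "key.deserializer"  -> classOf[StringDeserializer],
    "value.deserializer" -> classOf[StringDeserializer],
    "group.id"          -> "demo-group",
    "auto.offset.reset" -> "latest"
  )

  // Map of topic name -> its input DStream, so each topic can be stopped individually.
  val streams = mutable.Map.empty[String, InputDStream[ConsumerRecord[String, String]]]

  def subscribe(topic: String): Unit = {
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](Seq(topic), kafkaParams)
    )
    // Example processing per topic; replace with real logic.
    stream.foreachRDD(rdd => println(s"[$topic] ${rdd.count()} records"))
    streams(topic) = stream
  }

  // "Unsubscribe": stop just this topic's stream, leaving the shared context running.
  def unsubscribe(topic: String): Unit =
    streams.remove(topic).foreach(_.stop())
}
```

One caveat: the Spark Streaming scheduler expects input DStreams to be registered before `ssc.start()` is called, so topics should be subscribed up front; stopping an individual stream via `stop()` is what allows unsubscribing a single topic without tearing down the shared context.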

