如何使我的风暴拓扑实时工作? [英] how to make my storm topology to work real time?

查看:18
本文介绍了如何使我的风暴拓扑实时工作?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我创建了一个简单的程序来读取文件并生成一个文件.它运行良好.我担心如何使它成为实时拓扑.我想如果我修改源文件意味着添加了一个新记录,它应该进来我的目标文件如何在不将拓扑重新部署到集群上的情况下执行此操作.我还需要配置什么才能实现此行为.以下是本地提交拓扑的代码:-

I have created a simple program to read from file and generate a file.Its working perfectly.I am worrying about how to make it real time topology.I want if i modify source file means added a new record it should come in my target file how i will do it without redeploying my topology on cluster.What else i need to configure to achieve this behavior.Below is code of submitting topology locally:-

Config conf= new Config();
        conf.setDebug(false);
        conf.put(Config.TOPOLOGY_MAX_SPOUT_PENDING,1);
        TopologyBuilder builder = new TopologyBuilder();



            builder.setSpout("file-reader",new FileReaderSpout(args[0]));
            builder.setBolt("file-writer",new WriteToFileBolt(args[0])).shuffleGrouping("file-reader");
             LocalCluster cluster= new LocalCluster();
                cluster.submitTopology("File-To-File",conf,builder.createTopology());
                Thread.sleep(10000);
                cluster.shutdown();

推荐答案

您可能可以做的是使用与您的 Storm 集群集成的消息队列.Kafka 可能是一个非常好的候选者.它基本上是一个发布-订阅消息系统.有生产者负责将消息添加到队列,而另一端的消费者则负责检索相同的消息.

What you can probably do is use a message queue integrated with your storm cluster. Kafka could be a very nice candidate for this. It basically a publish-subscribed message system. There are producers responsible for adding messages to the queue and consumer on the other end to retrieve the same.

因此,如果您在生产者向队列发送/发布消息后立即将 Kafka 与 Storm 集成,它将可用于您的 Storm 拓扑.有一种叫做 KafkaSpout 的东西,它是一个普通的 spout 实现,能够从 Kafka 队列读取.

So if you integrate Kafka with storm as soon as your producer send/published a message to the queue it will be available to your storm topology. There is something called KafkaSpout which is a normal spout implementation capable of reading from a Kafka Queue.

因此,您的拓扑结构从 KafaSpout(订阅特定主题)开始,并在收到任何内容后立即发出,然后将输出链接到相应的 Bolt.

So it goes like this your topology starts with a KafaSpout (subscribed to a particular topic) and emitting as soon as it receives anything and then chain the output to your corresponding bolts.

您还可以寻找 Kestrel 作为 Kafka 的替代品.您应该根据具体解决您的目的进行选择.

You can also look for Kestrel as an alternative to Kafka . You should select based on what exactly solve your purpose.

这篇关于如何使我的风暴拓扑实时工作?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆