Configuring parallelism in Storm

Question

I am new to Apache Storm and am trying to work out how to configure Storm parallelism. There is a great article, "Understanding the Parallelism of a Storm Topology", but it only raises more questions.

When you have a multi-node Storm cluster, each topology is distributed as a whole according to the TOPOLOGY_WORKERS configuration parameter. So if you have 5 workers, then you have 5 copies of the spout (1 per worker), and the same goes for bolts.

How can I deal with situations like these inside a Storm cluster (preferably without creating external services)?

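  1. I need only one spout for all instances of the topology, for example when input data is pushed to the cluster through a network folder that is scanned for new files.
  2. A similar issue arises with a specific type of bolt, for example when data is processed by a licensed third-party library that is locked to a specific physical machine.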

Recommended answer

First, the basics:

  1. Workers - run executors; each worker has its own JVM
  2. Executors - run tasks; Storm distributes the executors across the workers
  3. Tasks - the instances that run your spout/bolt code

Second, a correction: having 5 workers does NOT mean you will automatically have 5 copies of your spout. Having 5 workers means you have 5 separate JVMs to which Storm can assign executors (think of them as 5 buckets).
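
The bucket picture can be sketched in plain Java. This is a simplified model and not Storm's scheduler: the class name BucketModel, the method assign, the executor labels, and the round-robin placement are all assumptions made for illustration. It only shows that the number of spout copies follows the number of executors you ask for, not the number of workers.

```java
import java.util.ArrayList;
import java.util.List;

public class BucketModel {
    // Places each executor into a worker "bucket" in round-robin order.
    // Round-robin is an assumption of this sketch; Storm's real scheduler
    // may place executors differently.
    static List<List<String>> assign(List<String> executors, int numWorkers) {
        List<List<String>> workers = new ArrayList<>();
        for (int i = 0; i < numWorkers; i++) {
            workers.add(new ArrayList<>());
        }
        for (int i = 0; i < executors.size(); i++) {
            workers.get(i % numWorkers).add(executors.get(i));
        }
        return workers;
    }

    public static void main(String[] args) {
        // One spout executor and three bolt executors (hypothetical labels).
        List<String> executors = List.of("spout-0", "bolt-0", "bolt-1", "bolt-2");
        List<List<String>> workers = assign(executors, 5);
        for (int i = 0; i < workers.size(); i++) {
            System.out.println("worker " + i + ": " + workers.get(i));
        }
    }
}
```

With one spout executor and 5 workers, the spout appears in exactly one bucket, and a bucket can stay empty when there are fewer executors than workers.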

The number of instances of your spout is configured when you first create and submit your topology:

TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("0-spout", new MySpout(), spoutParallelism).setNumTasks(spoutTasks);

Since you want only one spout for the entire cluster, you'd set both spoutParallelism and spoutTasks to 1.
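
To see why both values matter, here is a small plain-Java model of how tasks are shared among executors. It is a sketch under assumptions, not Storm code: the class TaskLayout and the method tasksPerExecutor are hypothetical, the sketch only handles the case where there are at least as many tasks as executors, and the even round-robin split stands in for whatever Storm does internally.

```java
import java.util.Arrays;

public class TaskLayout {
    // Returns how many tasks each executor runs, splitting tasks evenly.
    // Assumption of this sketch: numTasks >= numExecutors >= 1.
    static int[] tasksPerExecutor(int numTasks, int numExecutors) {
        if (numExecutors < 1 || numTasks < numExecutors) {
            throw new IllegalArgumentException("sketch expects numTasks >= numExecutors >= 1");
        }
        int[] counts = new int[numExecutors];
        for (int t = 0; t < numTasks; t++) {
            counts[t % numExecutors]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // spoutParallelism = 1, spoutTasks = 1: one executor running one spout instance.
        System.out.println(Arrays.toString(tasksPerExecutor(1, 1))); // prints [1]
        // Parallelism 2 with 4 tasks: two executors, two instances each.
        System.out.println(Arrays.toString(tasksPerExecutor(4, 2))); // prints [2, 2]
    }
}
```

In this model the number of spout instances equals the task count, so setting both values to 1 leaves a single instance no matter how many workers the topology has.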
