Distribution of content among cluster nodes within edge NiFi processors


Problem description

I was exploring the NiFi documentation. I must agree that it is one of the well-documented open-source projects out there.

My understanding is that a processor runs on all nodes of the cluster. However, I was wondering how the content is distributed among cluster nodes when we use content-pulling processors like FetchS3Object, FetchHDFS, etc. With processors like FetchHDFS or FetchSFTP, will all nodes make a connection to the source? Is the content split and fetched by multiple nodes, or does one node fetch the content and load-balance it across the downstream queues?

Recommended answer

The answer by @dagget has traditionally been the approach to handle this situation, often referred to as the "list + fetch" pattern: the List processor runs on the Primary Node only, the listings are sent to a Remote Process Group (RPG) to be redistributed across the cluster, and an Input Port receives the listings and connects to a Fetch processor running on all nodes, fetching in parallel.
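As a minimal sketch of the first half of that pattern, the snippet below uses the NiFi REST API to schedule the List processor on the Primary Node only. The base URL, the lack of authentication, and the processor id are assumptions for illustration; the RPG and Input Port wiring is still done in the flow itself.

```python
# Sketch: pin a List processor to the Primary Node via the NiFi REST API.
# Assumptions: unsecured NiFi at http://localhost:8080/nifi-api and a
# hypothetical processor id for the ListHDFS/ListSFTP processor.
import requests

NIFI_API = "http://localhost:8080/nifi-api"   # assumed base URL
LIST_PROCESSOR_ID = "0123-4567-89ab-cdef"      # hypothetical processor id

# Read the processor entity first to obtain its revision (required for updates).
entity = requests.get(f"{NIFI_API}/processors/{LIST_PROCESSOR_ID}").json()

update = {
    "revision": entity["revision"],
    "component": {
        "id": LIST_PROCESSOR_ID,
        "config": {
            # Run the listing only on the Primary Node so the source is listed once.
            "executionNode": "PRIMARY",
        },
    },
}

resp = requests.put(f"{NIFI_API}/processors/{LIST_PROCESSOR_ID}", json=update)
resp.raise_for_status()
print("List processor now scheduled on Primary Node only")
```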

In NiFi 1.8.0 there are now load-balanced connections, which remove the need for the RPG. You would still run the List processor on the Primary Node only, but then connect it directly to the Fetch processors and configure the queue in between to load balance.
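A minimal sketch of that 1.8.0+ setup via the REST API is shown below: it switches the queue between the List and Fetch processors to round-robin load balancing. The base URL and the connection id are assumptions for illustration; the same setting is available in the UI under the connection's configuration.

```python
# Sketch: enable round-robin load balancing on the List -> Fetch queue.
# Assumptions: unsecured NiFi 1.8.0+ at http://localhost:8080/nifi-api and a
# hypothetical connection id for the queue between the two processors.
import requests

NIFI_API = "http://localhost:8080/nifi-api"   # assumed base URL
CONNECTION_ID = "fedc-ba98-7654-3210"          # hypothetical connection id

# Read the connection to get its current revision and component definition.
entity = requests.get(f"{NIFI_API}/connections/{CONNECTION_ID}").json()

component = entity["component"]
# Distribute the queued listings evenly across all cluster nodes.
component["loadBalanceStrategy"] = "ROUND_ROBIN"
component["loadBalanceCompression"] = "DO_NOT_COMPRESS"

resp = requests.put(
    f"{NIFI_API}/connections/{CONNECTION_ID}",
    json={"revision": entity["revision"], "component": component},
)
resp.raise_for_status()
print("Queue between List and Fetch is now load balanced (round robin)")
```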
