Caching file content inside ExecuteScript processor of Apache NiFi
Question
I have an ExecuteScript processor that validates XML flow files against a schematron file. I'd like the content of the schematron file to be cached somewhere rather than read from disk again for every flow file.
What is the best option for doing this? Do I need yet another script that puts the content of the schematron into context.stateManager or PutDistributedMapCache or what?
Recommended answer
In a Groovy script you can declare a class with static variables, and those variables keep their values between invocations once the processor has started.
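As a minimal sketch of the static-variable idea, runnable outside NiFi: the class `FileCache`, its `loads` counter, and the temporary file are hypothetical names used only for illustration, and the temporary file stands in for the schematron file. The point is that the file is read from disk once and every later call gets the cached text.

```groovy
// Hypothetical helper (not part of NiFi): keeps a file's text in a static field.
class FileCache {
    static String text = null   // shared by every call made through this class
    static int loads = 0        // counts real disk reads, for illustration only

    // Read the file on the first call only; later calls return the cached text.
    static synchronized String get(File file) {
        if (text == null) {
            text = file.getText('UTF-8')
            loads++
        }
        return text
    }
}

// Stand-in for the schematron file, so the sketch runs anywhere.
File rules = File.createTempFile('validator', '.txt')
rules.deleteOnExit()
rules.setText('expected content', 'UTF-8')

String first = FileCache.get(rules)    // reads from disk
String second = FileCache.get(rules)   // served from the static field

assert first == 'expected content'
assert second == first
assert FileCache.loads == 1            // the file was read only once
```

This variant loads lazily on first use, which is an assumption made to keep the sketch self-contained; loading eagerly when the processor starts is an equally valid choice.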
Additionally, to manage the initialization of those static variables, you could use the ExecuteGroovyScript processor's ability to intercept processor start and stop.
In the following example I compare the flow-file content to a file on disk, because I'm not familiar with schematron.
import org.apache.nifi.processor.ProcessContext

// holder for state shared across script invocations
class Cache {
    static String validatorText = null
}

// this function is called on processor start, so you can't use a flow file in it
static void onStart(ProcessContext context) {
    // init the cached (static) variable from the file
    Cache.validatorText = new File('/path/to/validator.txt').getText('UTF-8')
    println "onStart ${context}"
}

// process the flow file and compare it to `Cache.validatorText`
def ff = session.get()
if (!ff) return
def ffText = ff.read().getText("UTF-8")
// the assertion fails if the content differs from the cached text
assert ffText == Cache.validatorText
REL_SUCCESS << ff
Note: you could set Failure strategy = transfer to failure. In that case, on any error (including an assertion failure) the flow file will be redirected to REL_FAILURE without additional code.