How to maintain a unique timestamp at the PutFile processor in NiFi

Problem description

I have some CSV data that needs to be written to a location under a unique file name that follows a specific naming convention.

Flow sequence

  1. MergeRecord -> The CSV rows are merged here and forwarded to the UpdateAttribute processor.

  2. UpdateAttribute -> When the merged flowfile content (a collection of CSV rows) passes through the UpdateAttribute processor, the current timestamp is assigned to the "filename" flowfile attribute using the following syntax: Test-${now():format("yyyyMMddHHmmssSSS", "IST")}.csv

  3. PutSFTP -> The flowfile received from UpdateAttribute is then published to the remote server.

Problem statement:

My remote server restricts the file name format, which must be exactly as defined below. Syntax: Test-${now():format("yyyyMMddHHmmssSSS", "IST")}.csv E.g.: Test-202104041836555.csv

At the UpdateAttribute processor, multiple flowfiles (merged content) are assigned the same filename because threads process several flowfiles at the same timestamp instant. When they are then written to the SFTP server, the CSV file fails to be placed on the remote server because a file with the same name already exists.
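The collision can be reproduced outside NiFi. The following is a minimal Python sketch, not NiFi code: `make_filename` is a hypothetical helper that mimics the expression above, "IST" is assumed to mean India Standard Time (UTC+05:30), and the three instants are made-up values chosen so that two of them fall within the same millisecond.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "IST" is taken to mean India Standard Time (UTC+05:30).
IST = timezone(timedelta(hours=5, minutes=30))


def make_filename(instant: datetime) -> str:
    """Mimic Test-${now():format("yyyyMMddHHmmssSSS", "IST")}.csv for one instant."""
    local = instant.astimezone(IST)
    millis = local.microsecond // 1000  # SSS = milliseconds, zero-padded to 3 digits
    return f"Test-{local.strftime('%Y%m%d%H%M%S')}{millis:03d}.csv"


# Three hypothetical flowfiles; the first two are stamped within the same millisecond.
base = datetime(2021, 4, 4, 13, 6, 55, 123_000, tzinfo=timezone.utc)
instants = [
    base,
    base + timedelta(microseconds=400),  # still inside millisecond .123
    base + timedelta(milliseconds=1),    # next millisecond, .124
]

names = [make_filename(t) for t in instants]
duplicates = {n for n in names if names.count(n) > 1}

print(names)
print("colliding names:", duplicates)
```

Because the pattern has only millisecond resolution, any two flowfiles stamped within the same millisecond receive identical names, which is what the SFTP server then rejects.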

Note: I cannot rely on the conflict resolution strategies of the PutSFTP processor, because the client of my SFTP server has a tight dependency on the file naming format.

Image for reference of the processor sequence flow:

UpdateAttribute processor properties:

Recommended answer

I found a solution by introducing a ControlRate processor. Configuration of the ControlRate processor:

By releasing only one flowfile at a time to the UpdateAttribute processor, I am able to assign a unique timestamp to each flowfile.
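The idea of pacing flowfiles so that each one gets its own timestamp can be sketched in plain Python. This is an illustration of the principle only, not the ControlRate implementation, and the actual ControlRate settings used are not shown in the text. The names `stamp` and `release_one_at_a_time` and the 10 ms gap are assumptions for the example, and "IST" is again assumed to be UTC+05:30.

```python
import time
from datetime import datetime, timedelta, timezone

# Assumption: "IST" is taken to mean India Standard Time (UTC+05:30).
IST = timezone(timedelta(hours=5, minutes=30))


def stamp(instant: datetime) -> str:
    """Build a name in the Test-yyyyMMddHHmmssSSS.csv pattern for one instant."""
    local = instant.astimezone(IST)
    return f"Test-{local.strftime('%Y%m%d%H%M%S')}{local.microsecond // 1000:03d}.csv"


def release_one_at_a_time(flowfiles, min_gap_seconds=0.01):
    """Yield (flowfile, filename) pairs, releasing at most one per min_gap_seconds.

    The gap is a hypothetical value; it only needs to exceed the 1 ms
    resolution of the timestamp pattern (and of the system clock).
    """
    last_release = None
    for flowfile in flowfiles:
        if last_release is not None:
            remaining = min_gap_seconds - (time.monotonic() - last_release)
            if remaining > 0:
                time.sleep(remaining)  # hold the flowfile back until the gap has passed
        last_release = time.monotonic()
        yield flowfile, stamp(datetime.now(timezone.utc))


for flowfile, name in release_one_at_a_time(["merged-1", "merged-2", "merged-3"]):
    print(flowfile, "->", name)
```

The names stay unique only while a single thread on a single node does the stamping and the gap stays longer than the timestamp resolution; with concurrent tasks or a cluster, two flowfiles could still be stamped within the same millisecond.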

Updated flow diagram:
