Using Apache NiFi to write CSV files by contents of column

Problem description

I have an Apache NiFi flow, where I read in a massive .csv file. Here's a sample .csv:

school, date, city
Vanderbilt, xxxx, xxxx
Georgetown, xxxx, xxxx
Duke, xxxx, xxxx
Vanderbilt, xxxx, xxxx

I want to use NiFi to read the file and then write a separate .csv file for each school name. That is, there would be one .csv file containing the two Vanderbilt records (two lines total, because there are two records), one file for Georgetown, and one file for Duke.

I've used GetFile to pull in my file (this works; I've verified it), then SplitText (Line Split Count = 1 and Header Line Count = 1), and then ExtractText, but my configuration for that last one is badly wrong. Finally, I have PutFile, which writes to where I need the output to go. Thanks.

Recommended answer

Take a look at NiFi's record processing capabilities. You will want to use PartitionRecord to partition on the school field, which will produce exactly what you are describing.
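To make the expected result concrete, here is a minimal sketch of what partitioning on the school field produces for the sample data. It is plain Python, not NiFi code, and the function name `partition_by_column` and the `<school>.csv` file naming are assumptions made for illustration. In NiFi itself, the equivalent would typically be a PartitionRecord processor configured with a CSV record reader and writer plus a user-defined property (for example `school`) whose value is the RecordPath `/school`; each outgoing FlowFile then carries a `school` attribute that a later step could use to name the file.

```python
import csv
import io
from collections import defaultdict

# Sample data copied from the question (note the space after each comma).
SAMPLE = """school, date, city
Vanderbilt, xxxx, xxxx
Georgetown, xxxx, xxxx
Duke, xxxx, xxxx
Vanderbilt, xxxx, xxxx
"""


def partition_by_column(csv_text, column):
    """Group CSV rows by one column's value.

    Returns a dict mapping a hypothetical file name ("<value>.csv")
    to CSV text that contains the header plus that group's rows.
    """
    # skipinitialspace=True drops the space that follows each comma.
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    header = reader.fieldnames

    groups = defaultdict(list)
    for row in reader:
        groups[row[column]].append(row)

    outputs = {}
    for value, rows in groups.items():
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=header, lineterminator="\n")
        writer.writeheader()
        writer.writerows(rows)
        outputs[f"{value}.csv"] = buf.getvalue()
    return outputs


if __name__ == "__main__":
    for name, content in partition_by_column(SAMPLE, "school").items():
        print(name)
        print(content)
```

The point of the sketch is that grouping happens on whole records rather than on individually split lines, which is why the SplitText and ExtractText steps are not needed with the record-based approach.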
