Using Apache NiFi to write CSV files by contents of column
Question
I have an Apache NiFi flow where I read in a massive .csv file. Here's a sample .csv:
school, date, city
Vanderbilt, xxxx, xxxx
Georgetown, xxxx, xxxx
Duke, xxxx, xxxx
Vanderbilt, xxxx, xxxx
I want to use NiFi to read the file and then output a separate .csv file per school name. That is, there would be one .csv file containing the two Vanderbilt records (two lines total, because there are two records), one file for Georgetown, and one file for Duke.
I've used GetFile to pull in my file (works, verified), then SplitText (line split count = 1 and header line count = 1), then ExtractText, but my configuration for that processor is badly wrong. Lastly, I have PutFile, which writes to where I need the output to go. Thanks.
Recommended answer
Take a look at NiFi's record processing capabilities. You will want to use PartitionRecord to partition on the school field, which groups the records by school and produces exactly the per-school output you are describing.
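To make the expected result concrete, here is a minimal Python sketch of the grouping that partitioning on the school field amounts to. It is not NiFi code and not how PartitionRecord is implemented; the function name partition_csv_by_column and the "<school>.csv" output naming are assumptions made only for illustration. In NiFi itself, the equivalent is usually set up by giving PartitionRecord a CSV record reader and writer plus a user-defined property whose value is a RecordPath such as /school, but check the processor documentation for your version.

```python
import csv
import io
from collections import defaultdict


def partition_csv_by_column(csv_text, column):
    """Group CSV rows by the value of `column`.

    Returns a dict mapping a hypothetical output file name ("<value>.csv")
    to the CSV text for that group, header included.
    """
    # skipinitialspace handles the "school, date, city" style of the sample,
    # where each comma is followed by a space.
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)

    groups = defaultdict(list)
    for row in reader:
        groups[row[column]].append(row)

    outputs = {}
    for key, rows in groups.items():
        buf = io.StringIO()
        writer = csv.DictWriter(
            buf, fieldnames=reader.fieldnames, lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(rows)
        outputs[f"{key}.csv"] = buf.getvalue()
    return outputs


if __name__ == "__main__":
    sample = (
        "school, date, city\n"
        "Vanderbilt, xxxx, xxxx\n"
        "Georgetown, xxxx, xxxx\n"
        "Duke, xxxx, xxxx\n"
        "Vanderbilt, xxxx, xxxx\n"
    )
    for name, text in partition_csv_by_column(sample, "school").items():
        print(name)
        print(text)
```

With the sample from the question, this yields three groups: one holding both Vanderbilt rows, one for Georgetown, and one for Duke. Unlike the SplitText approach, no per-line splitting is needed, because the grouping works on parsed records rather than raw text.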