How to reorder CSV columns in Apache NiFi
Question
How can I reorder the columns of a CSV file in Apache NiFi?
Input: I have multiple files that contain the same columns, but in a different order in each file.
Output: Pick out some of the columns and store them in the same order for every file.
Recommended answer
In my case, because I'm sure those columns are present in every CSV file, I only need to reorder them. So I use QueryRecord to reorder my CSV files.
For example, here are my CSV files:
\\file1
name, age, location, gender
Jack, 40, TW, M
Lisa, 30, CA, F
\\file2
name, location, gender, age
Mary, JP, F, 25
Kate, DE, F, 23
I'd like to reorder the columns to location,name,gender,age, so I add a new property named reorder_data to QueryRecord, with a value like:
SELECT location,name,gender,age FROM FLOWFILE
The data in the flowfile then becomes:
\\file1 - reordered
location, name, gender, age
TW, Jack, M, 40
CA, Lisa, F, 30
\\file2 - reordered
location, name, gender, age
JP, Mary, F, 25
DE, Kate, F, 23
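To make the effect of that query concrete outside NiFi, here is a minimal Python sketch that selects columns by header name, which is what the SELECT above does for each record. It is only an illustration: the function name reorder_csv is hypothetical, and it uses Python's csv module rather than NiFi's record readers and writers.

```python
import csv
import io


def reorder_csv(text, column_order):
    """Return CSV text containing only the columns in column_order, in that order.

    Hypothetical helper for illustration; it is not part of NiFi.
    """
    # skipinitialspace drops the blank that follows each comma in the sample files
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=column_order,
        extrasaction="ignore",  # columns not listed in column_order are dropped
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        # Each row is keyed by header name, so the input column order does not matter
        writer.writerow(row)
    return out.getvalue()


file1 = "name, age, location, gender\nJack, 40, TW, M\nLisa, 30, CA, F\n"
file2 = "name, location, gender, age\nMary, JP, F, 25\nKate, DE, F, 23\n"
order = ["location", "name", "gender", "age"]

print(reorder_csv(file1, order))
print(reorder_csv(file2, order))
```

Both inputs come out with the header location,name,gender,age because the rows are looked up by column name, not by position.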
This way QueryRecord gives me the reordered data (routed to the relationship named after the property, reorder_data) as well as the untouched input (routed to the original relationship), which is very convenient.
BTW, you can also keep the column order in a process group variable or a flowfile attribute, which makes it easier to maintain:
//Group variable or attribute
column_order location,name,gender,age
//Property in QueryRecord
reorder_data SELECT ${column_order} FROM FLOWFILE
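The idea is that ${column_order} is replaced with the variable's or attribute's value before the query runs. The following Python sketch only mimics that substitution with string.Template, which happens to share the ${name} syntax. It is not NiFi's Expression Language engine, and build_query and the attributes dictionary are hypothetical names used for illustration.

```python
from string import Template

# The property value as entered in QueryRecord
QUERY_TEMPLATE = Template("SELECT ${column_order} FROM FLOWFILE")


def build_query(attributes):
    """Fill the placeholder from a dict standing in for variables or attributes."""
    # substitute() raises KeyError if column_order is missing from attributes
    return QUERY_TEMPLATE.substitute(attributes)


attributes = {"column_order": "location,name,gender,age"}
print(build_query(attributes))
```

With this setup, changing the column order means editing one value instead of every query that uses it.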