How to reorder CSV columns in Apache NiFi


Problem description

How can I reorder the columns of a CSV file in Apache NiFi?

Input - I have multiple files that contain the same columns, but in a different order in each file.

Output - Pick some of the columns and store them in the same order for every file.

Recommended answer

In my case, I am sure those columns are present in every CSV file, so I only need to reorder them. I use the QueryRecord processor to do that.

For example, here are my CSV files:

\\file1
name, age, location, gender
Jack, 40, TW, M
Lisa, 30, CA, F 

\\file2
name, location, gender, age
Mary, JP, F, 25
Kate, DE, F, 23

I want to reorder the columns to location,name,gender,age. To do that, I added a new property named reorder_data to QueryRecord, with the following value:

SELECT location,name,gender,age FROM FLOWFILE

The data in the flowfile then becomes:

\\file1 - reordered
location, name, gender, age
TW, Jack, M, 40
CA, Lisa, F, 30

\\file2 - reordered
location, name, gender, age
JP, Mary, F, 25
DE, Kate, F, 23
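
To check the expected result outside NiFi, the same column projection can be sketched in plain Python. This is only an illustration of what the SELECT does, not how QueryRecord is implemented; the function name reorder_columns and the inline sample strings are assumptions made for this example.

```python
import csv
import io


def reorder_columns(csv_text, column_order):
    """Return csv_text reduced to the columns in column_order, in that order.

    Hypothetical helper for illustration only. Like the SELECT in the answer,
    it assumes every requested column exists in the input (KeyError otherwise).
    """
    # skipinitialspace drops the blank that follows each comma in the samples
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(column_order)
    for row in reader:
        # Look each value up by header name, so the input order does not matter
        writer.writerow([row[name].strip() for name in column_order])
    return out.getvalue()


file1 = "name, age, location, gender\nJack, 40, TW, M\nLisa, 30, CA, F\n"
file2 = "name, location, gender, age\nMary, JP, F, 25\nKate, DE, F, 23\n"
order = ["location", "name", "gender", "age"]

print(reorder_columns(file1, order))
print(reorder_columns(file2, order))
```

Because values are looked up by header name rather than by position, both files end up with the same layout regardless of their original column order.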

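As a result, QueryRecord gives me both the reordered data and the original data, which is very convenient: the query result is routed to the relationship named after the property (here reorder_data), and the unmodified flowfile is routed to the original relationship.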

By the way, you can also keep the column order in a process group variable or a flowfile attribute and reference it from the query, which makes it easier to maintain:

//Group variable or attribute
column_order   location,name,gender,age

//Property in QueryRecord
reorder_data   SELECT ${column_order} FROM FLOWFILE
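
The effect of the ${column_order} reference can be mimicked in Python with string.Template, which happens to use the same ${name} placeholder syntax. This is only an analogy for the substitution step, not NiFi's Expression Language itself; build_query is a hypothetical name used for this sketch.

```python
from string import Template


def build_query(column_order):
    """Fill the column list into the query text (hypothetical helper).

    Mirrors what the reader should expect the QueryRecord property to
    evaluate to once ${column_order} has been resolved.
    """
    template = Template("SELECT ${column_order} FROM FLOWFILE")
    return template.substitute(column_order=column_order)


print(build_query("location,name,gender,age"))
```

Changing the column order then means editing one variable or attribute value rather than every query that uses it.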
