Apache NiFi: Add column to csv using mapped values
A CSV is brought into the NiFi workflow using a GetFile processor. It has a column named "id", and each id stands for a certain string. There are around 3 ids. For example, if my CSV consists of
name,age,id
John,10,Y
Jake,55,N
Finn,23,C
I know that Y means York, N means Old and C means Cat. I want a new column with the header "nick" that holds the corresponding nick for each id.
name,age,id,nick
John,10,Y,York
Jake,55,N,Old
Finn,23,C,Cat
Finally, I want a CSV with the extra column and the appropriate data for each record. How is this possible using Apache NiFi? Please advise me on the processors that must be used and the configurations that must be changed to accomplish this task.
Flow:
- add a new nick column
- copy over the id to the nick column
- look at each line and match the id with its corresponding value
- set this value into current line in the nick column
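To make the intended result concrete, here is a minimal sketch of the four steps above in plain Python, outside NiFi. It is not a NiFi configuration; the function name `add_nick_column` and the `NICKS` dictionary are made up for illustration, and the mapping values are the ones given in the question.

```python
import csv
import io

# Mapping taken from the question: id -> nick
NICKS = {"Y": "York", "N": "Old", "C": "Cat"}


def add_nick_column(csv_text: str) -> str:
    """Return csv_text with an extra 'nick' column derived from 'id'."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    # Step 1: add a new nick column to the header.
    fieldnames = list(reader.fieldnames) + ["nick"]
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Step 2: copy the id over to the nick column.
        row["nick"] = row["id"]
        # Steps 3 and 4: look up the id and store the mapped value;
        # unknown ids are left unchanged.
        row["nick"] = NICKS.get(row["nick"], row["nick"])
        writer.writerow(row)
    return out.getvalue()


SAMPLE = "name,age,id\nJohn,10,Y\nJake,55,N\nFinn,23,C\n"
print(add_nick_column(SAMPLE))
```

Running it on the sample from the question prints the four-column CSV shown earlier.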
You can achieve this using either ReplaceText or ReplaceTextWithMapping. I do it with ReplaceText:
UpdateRecord will parse the csv file, add the new column and copy the id value:
Create a CSVReader and keep the default properties. Create a CSVRecordSetWriter and set Schema access strategy to Schema Text. Set Schema Text property to
{
  "type": "record",
  "name": "foobar",
  "namespace": "my.example",
  "fields": [
    {
      "name": "name",
      "type": "string"
    },
    {
      "name": "age",
      "type": "int"
    },
    {
      "name": "id",
      "type": "string"
    },
    {
      "name": "nick",
      "type": "string"
    }
  ]
}
Notice that it has the new column. Finally replace the original values with the mapping:
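The exact ReplaceText settings are not reproduced in the text here, so the following is only a sketch of the idea in Python, not the processor configuration. It assumes the previous step has already produced rows whose last field is a copy of the id, and it rewrites that last field line by line with a regular expression. The names `MAPPING`, `LAST_FIELD`, `map_last_column` and `replace_nicks` are hypothetical.

```python
import re

# Mapping taken from the question: id -> nick
MAPPING = {"Y": "York", "N": "Old", "C": "Cat"}

# Matches the final comma-separated field of a line.
LAST_FIELD = re.compile(r",([^,]*)$")


def map_last_column(line: str) -> str:
    """Replace the last field of one CSV line with its mapped value."""
    def swap(match: re.Match) -> str:
        value = match.group(1)
        # Leave values without a mapping untouched.
        return "," + MAPPING.get(value, value)
    return LAST_FIELD.sub(swap, line)


def replace_nicks(csv_text: str) -> str:
    """Apply the mapping to every data row, keeping the header as-is."""
    header, *rows = csv_text.splitlines()
    return "\n".join([header] + [map_last_column(row) for row in rows])


AFTER_UPDATE_RECORD = "name,age,id,nick\nJohn,10,Y,Y\nJake,55,N,N\nFinn,23,C,C"
print(replace_nicks(AFTER_UPDATE_RECORD))
```

Anchoring the pattern at the end of the line means only the nick column is rewritten, so the original id column keeps its single-letter value.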
PS: I noticed you are new to SO, welcome! You have not accepted a single answer in any of your previous questions. Accept them, if they solve your problem, as it will help others to find solutions.