Apache NiFi: Add column to csv using mapped values


Question


A csv is brought into the NiFi workflow using a GetFile processor. It has a column named "id", and each id stands for a certain string. There are around 3 ids. For example, suppose my csv consists of

name,age,id
John,10,Y
Jake,55,N
Finn,23,C

I am aware that Y means York, N means Old and C means Cat. I want a new column with a header named "nick" and have the corresponding nick for each id.

name,age,id,nick
John,10,Y,York
Jake,55,N,Old
Finn,23,C,Cat

Finally, I want a csv with the extra column and the appropriate data for each record. How is this possible using Apache NiFi? Please advise me on the processors that must be used and the configurations that must be changed in order to accomplish this task.

Solution

Flow:

  • add a new nick column
  • copy over the id to the nick column
  • look at each line and match id with its corresponding value
  • set this value into current line in the nick column
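To make the four steps above concrete outside of NiFi, here is a minimal Python sketch of the same transformation. It is only an illustration of the logic, not NiFi configuration: the function name add_nick_column and the NICK_BY_ID table are hypothetical, and the mapping values are taken from the question's example.

```python
import csv
import io

# Hypothetical lookup table built from the ids given in the question.
NICK_BY_ID = {"Y": "York", "N": "Old", "C": "Cat"}


def add_nick_column(csv_text: str) -> str:
    """Return csv_text with a trailing nick column derived from the id column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    # Step 1: add the new nick column to the header.
    writer = csv.DictWriter(
        out, fieldnames=[*reader.fieldnames, "nick"], lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        # Step 2: copy the id over to the nick column.
        row["nick"] = row["id"]
        # Steps 3 and 4: match the id and set the mapped value on this row.
        # Unknown ids are left as they are (an assumption of this sketch).
        row["nick"] = NICK_BY_ID.get(row["nick"], row["nick"])
        writer.writerow(row)
    return out.getvalue()


sample = "name,age,id\nJohn,10,Y\nJake,55,N\nFinn,23,C\n"
result = add_nick_column(sample)
print(result)
```

In the NiFi flow, steps 1 and 2 correspond to the record-based processor and steps 3 and 4 to the text replacement that follows it.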

You can achieve this using either ReplaceText or ReplaceTextWithMapping. I do it with ReplaceText:

UpdateRecord will parse the csv file, add the new column and copy the id value:

Create a CSVReader and keep the default properties. Create a CSVRecordSetWriter and set Schema Access Strategy to Use 'Schema Text' Property. Set the Schema Text property to

{
   "type":"record",
   "name":"foobar",
   "namespace":"my.example",
   "fields":[
      {
         "name":"name",
         "type":"string"
      },
      {
         "name":"age",
         "type":"int"
      },
      {
         "name":"id",
         "type":"string"
      },
      {
         "name":"nick",
         "type":"string"
      }
   ]
}

Notice that the schema includes the new nick column. Finally, ReplaceText replaces the copied id values in the nick column with their mapped values.
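The exact ReplaceText settings from the original answer are not reproduced in the text, so the following is only a sketch of the idea in Python: a regular expression anchored to the end of each line rewrites the last field when it is a known id. The names REPLACEMENTS and replace_last_column are hypothetical, and the input assumes the previous step has already produced a nick column holding a copy of the id.

```python
import re

# Assumed mapping, taken from the question's example.
REPLACEMENTS = {"Y": "York", "N": "Old", "C": "Cat"}


def replace_last_column(csv_text: str) -> str:
    """Rewrite the final field of each line when it is a known id."""
    ids = "|".join(re.escape(key) for key in REPLACEMENTS)
    # ",<id>" directly before a line end, so the id column itself is untouched.
    pattern = re.compile(rf",({ids})$", re.MULTILINE)
    return pattern.sub(lambda m: "," + REPLACEMENTS[m.group(1)], csv_text)


# What the csv might look like after the id has been copied into nick.
after_update_record = "name,age,id,nick\nJohn,10,Y,Y\nJake,55,N,N\nFinn,23,C,C\n"
mapped = replace_last_column(after_update_record)
print(mapped)
```

Anchoring the pattern to the line end is the important design choice here: a plain search for "Y" would also hit the id column or any name containing that letter.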

PS: I noticed you are new to SO, welcome! You have not accepted a single answer in any of your previous questions. Accept them, if they solve your problem, as it will help others to find solutions.
