Apache NiFi: Add column to csv using mapped values


Question

A csv is brought into the NiFi workflow using a GetFile processor. It has a column named "id", and each id stands for a certain string. There are around 3 ids. For example, if my csv consists of

name,age,id
John,10,Y
Jake,55,N
Finn,23,C

I know that Y means York, N means Old and C means Cat. I want a new column with the header "nick" that holds the corresponding nick for each id.

name,age,id,nick
John,10,Y,York
Jake,55,N,Old
Finn,23,C,Cat

Finally, I want a csv with the extra column and the appropriate data for each record. How is this possible using Apache NiFi? Please advise me on the processors that must be used and the configuration that must be changed to accomplish this task.

Solution

Flow:

  • add a new nick column
  • copy over the id to the nick column
  • look at each line and match the id with its corresponding value
  • set this value into current line in the nick column
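To make the four steps concrete outside of NiFi, here is a minimal Python sketch of the same logic. It is only an illustration of what the flow computes, not NiFi configuration; the function name add_nick_column and the choice to leave unknown ids unchanged are assumptions, while the mapping itself comes from the question.

```python
import csv
import io

# Mapping taken from the question: id -> nick.
NICK_BY_ID = {"Y": "York", "N": "Old", "C": "Cat"}


def add_nick_column(csv_text: str) -> str:
    """Return csv_text with a 'nick' column derived from the 'id' column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames + ["nick"], lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        # Steps 1-2: add the nick column and copy the id into it.
        row["nick"] = row["id"]
        # Steps 3-4: match the copied id with its value and store that value.
        # Unknown ids keep the raw id (an assumption, not from the question).
        row["nick"] = NICK_BY_ID.get(row["nick"], row["nick"])
        writer.writerow(row)
    return out.getvalue()


sample = "name,age,id\nJohn,10,Y\nJake,55,N\nFinn,23,C\n"
result = add_nick_column(sample)
print(result)
```

In NiFi the same work is split across processors, as described next: one step adds and fills the column, a later step swaps in the mapped values.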

The mapping step can be done with either ReplaceText or ReplaceTextWithMapping; I use ReplaceText here.

First, UpdateRecord parses the csv file, adds the new column and copies the id value into it.

Create a CSVReader and keep its default properties. Create a CSVRecordSetWriter and set its Schema Access Strategy to use the Schema Text property. Set the Schema Text property to:

{
   "type":"record",
   "name":"foobar",
   "namespace":"my.example",
   "fields":[
      {
         "name":"name",
         "type":"string"
      },
      {
         "name":"age",
         "type":"int"
      },
      {
         "name":"id",
         "type":"string"
      },
      {
         "name":"nick",
         "type":"string"
      }
   ]
}
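As a rough illustration of what this Avro-style schema text says about each record, the following Python sketch parses it and uses the declared field names and types to turn one row of strings into a typed record. The CONVERTERS table and the coerce_row helper are hypothetical names introduced for this example; NiFi's record writer does its own schema handling.

```python
import json

# The schema text from the answer, compacted.
SCHEMA_TEXT = """
{
  "type": "record",
  "name": "foobar",
  "namespace": "my.example",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "age", "type": "int"},
    {"name": "id", "type": "string"},
    {"name": "nick", "type": "string"}
  ]
}
"""

# Hypothetical helper table: schema type name -> Python converter.
CONVERTERS = {"string": str, "int": int}


def coerce_row(values, schema_text=SCHEMA_TEXT):
    """Pair row values with the schema's fields and convert their types."""
    fields = json.loads(schema_text)["fields"]
    if len(values) != len(fields):
        raise ValueError("row has %d values, schema has %d fields"
                         % (len(values), len(fields)))
    return {f["name"]: CONVERTERS[f["type"]](v) for f, v in zip(fields, values)}


# A row after the id has been copied into the new nick column.
record = coerce_row(["John", "10", "Y", "Y"])
print(record)
```

The field order in the schema is the column order of the written csv, which is why nick is listed last.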

Notice that the schema includes the new nick column. Finally, use ReplaceText to replace the copied id values in the nick column with their mapped strings.
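The exact ReplaceText settings are not shown in this answer, so the following is only a Python sketch of the idea: a line-by-line regular-expression replacement that rewrites the trailing nick value. The patterns in RULES are assumptions chosen for the three ids in the question, and replace_nicks is a hypothetical helper, not a NiFi API.

```python
import re

# Assumed (pattern, replacement) pairs: each one rewrites the last column
# of a line when it still holds the copied id.
RULES = [
    (re.compile(r",Y$"), ",York"),
    (re.compile(r",N$"), ",Old"),
    (re.compile(r",C$"), ",Cat"),
]


def replace_nicks(csv_text: str) -> str:
    """Apply the first matching rule to each line, one line at a time."""
    out_lines = []
    for line in csv_text.splitlines():
        for pattern, replacement in RULES:
            new_line, count = pattern.subn(replacement, line)
            if count:
                line = new_line
                break  # at most one rule applies per line
        out_lines.append(line)
    return "\n".join(out_lines) + "\n"


# Intermediate content: the nick column still contains a copy of the id.
intermediate = "name,age,id,nick\nJohn,10,Y,Y\nJake,55,N,N\nFinn,23,C,C\n"
final = replace_nicks(intermediate)
print(final)
```

Anchoring each pattern at the end of the line keeps the original id column untouched and changes only the copied value in the nick column.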

PS: I noticed you are new to SO, welcome! You have not accepted a single answer in any of your previous questions. Accept them, if they solve your problem, as it will help others to find solutions.
