Extract the filename and store it in a new column in a CSV file


Problem description

I want to extract the filename and store it in one of the existing columns of the CSV file. How can I do this? Which processor should I use, and with what configuration? For example, I have a file named 'FE_CHRGRSIM_20171207150616_CustRec.csv', and I want to extract 'FE_CHRGRSIM_20171207150616' and store this value under an existing column in the same CSV file. Please help. TIA

Recommended answer

Usually the "real" file name is available as an attribute on the flow file called "filename". You can use UpdateRecord with a Replacement Value Strategy of "Literal Value": add a user-defined property called /filename and set its value to ${filename:substringBeforeLast('.')}. You'll need to make sure that the "filename" field is present in your schema (added either by UpdateRecord or manually). If you don't know your CSV schema ahead of time, you can use InferAvroSchema, which will try to figure it out.
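To make the UpdateRecord suggestion more concrete, here is a minimal Python sketch of what that configuration is expected to do to the data. It is not NiFi code. The helper names substring_before_last and set_filename_field and the sample columns id and name are made up for illustration, and it assumes the column should be appended to the header when it is not already there.

```python
import csv
import io


def substring_before_last(subject: str, sep: str) -> str:
    """Return the part of subject before the last sep, or subject if sep is absent."""
    head, found, _ = subject.rpartition(sep)
    return head if found else subject


def set_filename_field(csv_text: str, filename: str, field: str = "filename") -> str:
    """Set `field` on every record to the file name without its extension."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or [])
    if field not in fieldnames:
        # Mirrors "make sure the field is in the schema".
        fieldnames.append(field)
    value = substring_before_last(filename, ".")

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[field] = value  # literal value, identical for every record
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    sample = "id,name\n1,alice\n"
    print(set_filename_field(sample, "FE_CHRGRSIM_20171207150616_CustRec.csv"))
```

Note that cutting at the last '.' turns the example name into FE_CHRGRSIM_20171207150616_CustRec. The value asked for in the question, FE_CHRGRSIM_20171207150616, corresponds to cutting at the last underscore instead, which in Expression Language would be ${filename:substringBeforeLast('_')}.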

If UpdateRecord and the schema approach don't seem to be working for you, an alternative (since it's CSV) is to use ReplaceText: match each entire line, then replace it with the matched text followed by ,${filename:substringBeforeLast('.')}. That should add the filename (with the extension removed) as the last column of the outgoing CSV.
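The ReplaceText alternative amounts to a per-line regular-expression substitution. The Python sketch below shows the same transformation outside NiFi. The function name append_filename_column and the sample data are hypothetical, and the pattern is only one way to match "the entire line".

```python
import re


def append_filename_column(csv_text: str, filename: str) -> str:
    """Append the file name (extension removed) to the end of every non-empty line."""
    head, found, _ = filename.rpartition(".")
    value = head if found else filename  # keep the whole name if there is no '.'
    # (?m) makes ^ match at each line start; [^\r\n]+ captures the whole line
    # without its line terminator, so both \n and \r\n endings are preserved.
    return re.sub(r"(?m)^([^\r\n]+)", lambda m: f"{m.group(1)},{value}", csv_text)


if __name__ == "__main__":
    sample = "1,alice\n2,bob\n"
    print(append_filename_column(sample, "FE_CHRGRSIM_20171207150616_CustRec.csv"))
```

Because every line is treated the same way, a header row, if the file has one, also gets the file name appended rather than a column name. The approach is also purely textual, so it adds a new trailing column rather than filling an existing one.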
