Extract filename and store the name in a new column in a CSV file

Question

I want to extract the file name and store it in one of the existing columns of the CSV file. How can I do this? Which processor should I use, and with what configuration? For example, I have a file named 'FE_CHRGRSIM_20171207150616_CustRec.csv' and I want to extract 'FE_CHRGRSIM_20171207150616' and store this value under an existing column in the same CSV file. Please help. TIA

Recommended answer

Usually the "real" file name is available as an attribute on the flow file called "filename". You can use UpdateRecord with a Replacement Strategy of "Literal Value"; add a user-defined property called /filename and set the value to ${filename:substringBeforeLast('.')}. You'll need to make sure that the "filename" field is added to your schema (either by UpdateRecord or manually). If you don't know your CSV schema ahead of time, you can use InferAvroSchema and it will try to figure it out.
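
To make the effect of that UpdateRecord setup concrete, here is a minimal Python sketch of the same transformation done outside NiFi. It is only an illustration, not NiFi code: the helper names `strip_extension` and `set_filename_column`, the `id` column, and the sample rows are all made up for this example. `strip_extension` is meant to mirror what `substringBeforeLast('.')` does to the file name.

```python
import csv
import io


def strip_extension(filename: str) -> str:
    """Mimic ${filename:substringBeforeLast('.')}: keep the text before the last dot."""
    return filename.rsplit(".", 1)[0]


def set_filename_column(csv_text: str, filename: str, column: str = "filename") -> str:
    """Write the extension-less file name into `column` of every record."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or [])
    if column not in fieldnames:
        # Same idea as adding the field to the schema before updating it.
        fieldnames.append(column)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = strip_extension(filename)
        writer.writerow(row)
    return out.getvalue()


# Hypothetical sample data: an existing, empty "filename" column.
sample = "id,filename\n1,\n2,\n"
result = set_filename_column(sample, "FE_CHRGRSIM_20171207150616_CustRec.csv")
print(result)
```

Note that stripping only the extension leaves FE_CHRGRSIM_20171207150616_CustRec. To get FE_CHRGRSIM_20171207150616 as asked in the question, the split would have to happen at the last underscore instead, which in NiFi Expression Language would presumably be ${filename:substringBeforeLast('_')}.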

If UpdateRecord and the schema stuff don't seem to be working for you, an alternative (since it's CSV) is to use ReplaceText, match the entire line, then replace with that value followed by ,${filename:substringBeforeLast('.')}. That should add the filename (with extension removed) as the last column in the outgoing CSV.
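
The ReplaceText idea can be sketched the same way in plain Python: match each whole line and put it back with the extra value appended. Again this is only an illustration under assumptions, not NiFi configuration; `append_filename` and the sample rows are hypothetical.

```python
import re


def append_filename(csv_text: str, filename: str) -> str:
    """Append ',<file name without extension>' to every non-empty line."""
    base = filename.rsplit(".", 1)[0]
    # A function is used as the replacement so that characters in `base`
    # are never interpreted as regex backreferences.
    return re.sub(r"(?m)^(.+)$", lambda m: f"{m.group(1)},{base}", csv_text)


# Hypothetical sample data with a header row.
sample = "id,name\n1,alice\n2,bob\n"
print(append_filename(sample, "FE_CHRGRSIM_20171207150616_CustRec.csv"))
```

Because every line is matched, a header row gets the value appended too, so the new last column ends up with the file name as its header unless the header is handled separately.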
