Converting Fixed width file to delimited file in Azure Data Factory - Mapping Data Flow
Problem description
I have ten files (.txt) in ADLS, and their metadata in an Azure SQL database. My metadata looks like this:
I am trying to convert the fixed-width files into delimited files with headers using a Mapping Data Flow. The only reference from Microsoft on this topic is https://docs.microsoft.com/en-us/azure/data-factory/how-to-fixed-width.
But I have multiple files with varying numbers of columns. Is there any way I can pass this metadata from the table to a Derived Column transformation? I know this is easily achievable with Databricks, but I have to do it with a Data Flow.
Any references or pointers will be really helpful.
Thank you.
Yes, you can pass this metadata from the table to a Derived Column transformation.
Please follow my steps:
- I linked the DataSource to my ADLS test file folder, which contains T1.txt, T2.txt and T3.txt. It's important to set the Column to store file name option so that we can capture the file metadata (where the data comes from). In the Data preview tab, we can see the info:
- Then I use the expression
replace(replace(FileName,'/',''),'.txt','')
to strip the redundant characters (the path separator and the .txt extension).
- In the SQLSource, I created a table following your example.
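The file-name cleanup expression above can be sketched in Python (a minimal illustration only; it assumes the captured file names look like '/T1.txt', as in the test folder):

```python
def clean_file_name(file_name: str) -> str:
    # Mirrors replace(replace(FileName,'/',''),'.txt',''):
    # drop the path separator, then drop the extension.
    return file_name.replace('/', '').replace('.txt', '')

print(clean_file_name('/T1.txt'))  # -> T1
```

The cleaned name is what lets the join against the metadata table match each data row to its file's column definitions.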
- Then I joined the two sources.
- I use the expression
substring(toString({_col0_}), startPosition, data_length)
to split the string according to the metadata.
- The result shows:
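The metadata-driven split can be simulated in Python. This is a sketch, not the Data Flow itself; the column names (startPosition, data_length) and the sample values are assumptions modeled on the metadata table in the question. Note that ADF's substring() is 1-based, hence the `- 1` in the slice:

```python
# Hypothetical metadata rows: one entry per output column,
# each with a 1-based start position and a length.
metadata = [
    {'column': 'col1', 'startPosition': 1, 'data_length': 3},
    {'column': 'col2', 'startPosition': 4, 'data_length': 5},
    {'column': 'col3', 'startPosition': 9, 'data_length': 2},
]

def split_fixed_width(line: str, meta) -> dict:
    # Equivalent of substring(toString({_col0_}), startPosition, data_length):
    # ADF substring is 1-based, so the Python slice starts at startPosition - 1.
    return {
        m['column']: line[m['startPosition'] - 1 : m['startPosition'] - 1 + m['data_length']]
        for m in meta
    }

row = split_fixed_width('ABC12345XY', metadata)
print(row)  # -> {'col1': 'ABC', 'col2': '12345', 'col3': 'XY'}
```

Because the start positions and lengths come from the joined metadata rows rather than being hard-coded, the same transformation handles files with varying numbers of columns.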
Hope my answer is helpful to you.