Azure Data Factory - How do I iterate through records in a CSV file using a ForEach loop

Views: 47 Published: 2021/4/13 19:52:34 Tags: azure csv azure-data-factory azure-data-flow

Question

What I am trying to achieve:

I have a CSV file (FlattenedListDocument.csv) with the following columns:

DocumentKey, DocumentName

Sample values are as follows (there are approximately 240,000 rows in this CSV file):

12212, Hitch Hikers Guide to the Galaxy
12233, MoneyBall

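I have to create one JSON file for every row in the CSV file (one file per row), and each file will be consumed by another utility.
I am confused about how to pass the values from the CSV into a ForEach activity so that it iterates over the rows of the file.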

Recommended answer

This is a really interesting problem to solve in Data Factory. The only option I see is to have a Data Flow with a Sink partition that outputs files based on a Derived Column.

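Create a Derived Column that generates a unique blob name (the FileName column). Make sure to include the folder path.

In the Sink, under "Settings", change "File name option" to "As data in column", then select the FileName column you created in step 1.

Optionally, in the Sink under "Mapping", remove the FileName column so that it is not written into the output files.

When the data flow finishes, you should see the generated files in Blob storage, one per row.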
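The three steps above can be emulated locally to see what the data flow is expected to produce. The following is a minimal Python sketch, not Data Factory code: the `output` folder name, the `.json` extension, and the function name `write_one_json_per_row` are assumptions made for illustration, and the key column is called DocumentKey as in the question. Inside the data flow itself, the Derived Column expression would likely be something along the lines of `concat('output/', toString(DocumentKey), '.json')`, but treat that as an untested assumption.

```python
import csv
import io
import json
import tempfile
from pathlib import Path

# Two rows taken from the question's sample data.
SAMPLE_CSV = """DocumentKey,DocumentName
12212,Hitch Hikers Guide to the Galaxy
12233,MoneyBall
"""


def write_one_json_per_row(csv_text: str, out_dir: Path, folder: str = "output") -> list[str]:
    """Emulate the data flow locally: write one JSON file per CSV row."""
    written = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Step 1 (Derived Column): build the file name, folder path included.
        # The "output" folder and ".json" suffix are hypothetical choices.
        file_name = f"{folder}/{row['DocumentKey']}.json"
        target = out_dir / file_name
        target.parent.mkdir(parents=True, exist_ok=True)
        # Steps 2 and 3 (Sink): the derived name decides where the row lands,
        # and only the original columns are serialized, not the file name.
        target.write_text(json.dumps(row), encoding="utf-8")
        written.append(file_name)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(write_one_json_per_row(SAMPLE_CSV, Path(tmp)))
```

Each written file holds a single JSON object with the DocumentKey and DocumentName values of its row, which matches the "one file per row" requirement in the question.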

The caveat, of course, is that the file name needs to be unique, so I based it on the first column in your sample (which I named "Id"). I have no idea what performance will be like with roughly 240K files (one per row), but this should get the result that you want.
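Because two rows that yield the same file name would target the same blob, it may be worth checking the key column for duplicates before running the data flow. Below is a minimal Python sketch of such a check, run locally against the CSV text. The function name `duplicate_keys` is hypothetical, the column is assumed to be DocumentKey (the answer calls it "Id"), and the three-row sample in the usage part is made up for illustration.

```python
import csv
import io
from collections import Counter


def duplicate_keys(csv_text: str, key_column: str = "DocumentKey") -> list[str]:
    """Return key values that occur more than once, in first-seen order."""
    rows = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row[key_column] for row in rows)
    # Any key with a count above 1 would map several rows to one file name.
    return [key for key, n in counts.items() if n > 1]


if __name__ == "__main__":
    # Hypothetical sample: the key 12212 appears twice.
    sample = "DocumentKey,DocumentName\n12212,A\n12233,B\n12212,C\n"
    print(duplicate_keys(sample))
```

An empty result means every row would get its own file name when the name is derived from that column alone.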
