ADFv2 - how to process multiple input files from different folders in a USQL job


Question

Hi all,

We have a requirement to process multiple files from different folders using Azure Data Factory and USQL.

Here is our folder structure:

Year --> Month --> Day

We have a folder for every date, say 1, 2, 3 ... 31. The requirement is to read files from specific folders and pass them to USQL for analytics processing. We need to process data for multiple dates. Is there any way in Data Factory to read data from multiple folders?

Example: I need to read data for dates 1, 7 and 10 of a specific month. I do not want to read all the files for the month.

Please let us know if you have come across a solution for the above scenario.

We were thinking of having a serverless component move the files for the specific dates to a new folder (say, staged) and then giving the staged folder path as the input to the USQL job. However, if there is a better approach, we can avoid this unnecessary file transfer within Azure Data Lake Store.

Recommended answer

Hi Akhilesh,

To copy data from multiple files, you could use a ForEach activity to wrap the Copy activity. Assuming the file names are known before the copy, you could construct an array of the file names and pass it to the "items" property of the ForEach activity. Here is an example. Hope it helps. Thanks.
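
A rough sketch of that idea, written as a small Python script that builds the pipeline JSON so it is easy to adapt. It is not the example originally linked in the answer. It builds one folder path per requested day (1, 7 and 10) and passes the array to the "items" property of a ForEach activity. Note that it wraps a U-SQL activity instead of a Copy activity, since the question is about feeding folders to a USQL job. The pipeline name, activity names, linked service names, script path, the `inputFolder` parameter and the year/month values are all made up for illustration, and the unpadded `Year/Month/Day` path is an assumption based on the folder layout described in the question.

```python
import json


def day_folders(year, month, days):
    """Return one folder path per requested day, following Year/Month/Day."""
    # Assumption: folders are not zero-padded (1, 2, 3 ... 31 as in the question).
    return [f"{year}/{month}/{day}" for day in days]


def usql_activity():
    """U-SQL activity run once per folder. All names here are hypothetical."""
    return {
        "name": "RunUsqlForFolder",
        "type": "DataLakeAnalyticsU-SQL",
        "linkedServiceName": {
            "referenceName": "AzureDataLakeAnalyticsLinkedService",
            "type": "LinkedServiceReference",
        },
        "typeProperties": {
            "scriptLinkedService": {
                "referenceName": "AzureDataLakeStoreLinkedService",
                "type": "LinkedServiceReference",
            },
            "scriptPath": "scripts/ProcessDay.usql",
            # @item() is the current element of the ForEach "items" array.
            # The script is assumed to read its input folder from this parameter.
            "parameters": {
                "inputFolder": {"value": "@item()", "type": "Expression"}
            },
        },
    }


def build_pipeline(folders):
    """Wrap the inner activity in a ForEach that iterates over the folder array."""
    for_each = {
        "name": "ForEachDayFolder",
        "type": "ForEach",
        "typeProperties": {
            "isSequential": True,
            "items": {
                "value": "@pipeline().parameters.folderPaths",
                "type": "Expression",
            },
            "activities": [usql_activity()],
        },
    }
    return {
        "name": "ProcessSelectedDays",
        "properties": {
            # The array can also be supplied at trigger time instead of as a default.
            "parameters": {
                "folderPaths": {"type": "Array", "defaultValue": folders}
            },
            "activities": [for_each],
        },
    }


if __name__ == "__main__":
    folders = day_folders(2018, 3, [1, 7, 10])
    print(json.dumps(build_pipeline(folders), indent=2))
```

With this shape, only the folders in the array are touched, so there is no need to move files into a staging folder first. If a single USQL job must see all the selected days at once rather than one job per folder, this per-folder loop would not be enough on its own.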

