Execute U-SQL script in ADL storage from Data Factory in Azure


Problem description

I have a U-SQL script stored in my ADL store and I am trying to execute it. The script file is quite big, about 250 MB.

So far I have a Data Factory, I have created a Linked Service, and I am trying to create a Data Lake Analytics U-SQL Activity.

The code for my U-SQL Activity looks like this:

{
    "name": "RunUSQLScript1",
    "properties": {
        "description": "Runs the USQL Script",
        "activities": [
            {
                "name": "DataLakeAnalyticsUSqlActivityTemplate",
                "type": "DataLakeAnalyticsU-SQL",
                "linkedServiceName": "AzureDataLakeStoreLinkedService",
                "typeProperties": {
                    "scriptPath": "/Output/dynamic.usql",
                    "scriptLinkedService": "AzureDataLakeStoreLinkedService",
                    "degreeOfParallelism": 3,
                    "priority": 1000
                },
                "policy": {
                    "concurrency": 1,
                    "executionPriorityOrder": "OldestFirst",
                    "retry": 3,
                    "timeout": "01:00:00"
                },
                "scheduler": {
                    "frequency": "Day",
                    "interval": 1
                }
            }
        ],
        "start": "2017-05-02T00:00:00Z",
        "end": "2017-05-02T00:00:00Z"
    }
}

However, I get the following error:

Error

Activity 'DataLakeAnalyticsUSqlActivityTemplate' from pipeline 'RunUSQLScript1' has no output(s) and no schedule. Please add an output dataset or define activity schedule.

What I would like is to have this Activity run on demand, i.e. I do not want it scheduled at all. I also do not understand what the Inputs and Outputs are in my case. The U-SQL script I am trying to run operates on millions of files in my ADL storage and saves them after some modification of their content.

Recommended answer

Currently ADF does not support executing an activity on demand; it must be configured with a schedule. You need at least one output dataset to drive the scheduled execution of the activity. The output can be a dummy Azure Storage dataset to which no data is actually written, because ADF only uses its availability properties to drive the scheduled execution. For example:

{
 "name": "OutputDataset",
 "properties": {
     "type": "AzureBlob",
     "linkedServiceName": "AzureStorageLinkedService",
     "typeProperties": {
         "fileName": "dummyoutput.txt",
         "folderPath": "adf/output",
         "format": {
             "type": "TextFormat",
             "columnDelimiter": "\t"
         }
     },
     "availability": {
         "frequency": "Day",
         "interval": 1
     }
 }
}
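
To make the connection to the question's pipeline concrete, the following is a minimal sketch of how the activity might reference that dummy dataset through an "outputs" array. Everything except the "outputs" entry is copied from the question. Two things here are assumptions rather than part of the original answer: the "end" date has been moved one day later, on the assumption that the active period must span at least one daily slice for anything to run, and the linked service names are left exactly as the asker wrote them, which may still need adjusting for a real Data Lake Analytics account.

```json
{
    "name": "RunUSQLScript1",
    "properties": {
        "description": "Runs the USQL Script (illustrative sketch with a dummy output added)",
        "activities": [
            {
                "name": "DataLakeAnalyticsUSqlActivityTemplate",
                "type": "DataLakeAnalyticsU-SQL",
                "linkedServiceName": "AzureDataLakeStoreLinkedService",
                "typeProperties": {
                    "scriptPath": "/Output/dynamic.usql",
                    "scriptLinkedService": "AzureDataLakeStoreLinkedService",
                    "degreeOfParallelism": 3,
                    "priority": 1000
                },
                "outputs": [
                    {
                        "name": "OutputDataset"
                    }
                ],
                "policy": {
                    "concurrency": 1,
                    "executionPriorityOrder": "OldestFirst",
                    "retry": 3,
                    "timeout": "01:00:00"
                },
                "scheduler": {
                    "frequency": "Day",
                    "interval": 1
                }
            }
        ],
        "start": "2017-05-02T00:00:00Z",
        "end": "2017-05-03T00:00:00Z"
    }
}
```

The "scheduler" block of the activity and the "availability" block of OutputDataset use the same frequency and interval, so the activity runs once per daily slice of the dummy output.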
