Run U-SQL Script from C# code with Azure Data Factory


Question

I am trying to run a U-SQL script on Azure from C# code. After the code executes, everything is created in Azure (the data factory, linked services, pipeline, and datasets), but the U-SQL script is never executed by ADF. I suspect there is an issue with the startTime and endTime configured in the pipeline code.

I followed this article to build the console application: Create, monitor, and manage Azure data factories using the Data Factory .NET SDK.

Here is the URL of my complete C# project for download: https://1drv.ms/u/s!AltdTyVEmoG2ijOupx-EjCM-8Zk4

Could someone please help me find my mistake?

C# code to configure the pipeline:

        DateTime PipelineActivePeriodStartTime = new DateTime(2017, 1, 12, 0, 0, 0, 0, DateTimeKind.Utc);
        DateTime PipelineActivePeriodEndTime = PipelineActivePeriodStartTime.AddMinutes(60);
        string PipelineName = "ComputeEventsByRegionPipeline";

        var usqlparams = new Dictionary<string, string>();
        usqlparams.Add("in", "/Samples/Data/SearchLog.tsv");
        usqlparams.Add("out", "/Output/testdemo1.tsv");

        client.Pipelines.CreateOrUpdate(resourceGroupName, dataFactoryName,
        new PipelineCreateOrUpdateParameters()
        {
            Pipeline = new Pipeline()
            {
                Name = PipelineName,
                Properties = new PipelineProperties()
                {
                    Description = "This is a demo pipe line.",

                    // Initial value for pipeline's active period. With this, you won't need to set slice status
                    Start = PipelineActivePeriodStartTime,
                    End = PipelineActivePeriodEndTime,
                    IsPaused = false,

                    Activities = new List<Activity>()
                    {
                        new Activity()
                        {
                            // Inline U-SQL script; @in and @out are supplied via the Parameters dictionary below
                            TypeProperties = new DataLakeAnalyticsUSQLActivity(
                                "@searchlog = EXTRACT UserId int, Start DateTime, Region string, Query string, Duration int?, Urls string, ClickedUrls string " +
                                "FROM @in USING Extractors.Tsv(nullEscape:\"#NULL#\"); " +
                                "@rs1 = SELECT Start, Region, Duration FROM @searchlog; " +
                                "OUTPUT @rs1 TO @out USING Outputters.Tsv(quoting:false);")
                            {
                                DegreeOfParallelism = 3,
                                Priority = 100,
                                Parameters = usqlparams
                            },
                            Inputs = new List<ActivityInput>()
                            {
                                new ActivityInput(Dataset_Source)
                            },
                            Outputs = new List<ActivityOutput>()
                            {
                                new ActivityOutput(Dataset_Destination)
                            },
                            Policy = new ActivityPolicy()
                            {
                                Timeout = new TimeSpan(6,0,0),
                                Concurrency = 1,
                                ExecutionPriorityOrder = ExecutionPriorityOrder.NewestFirst,
                                Retry = 1
                            },
                            Scheduler = new Scheduler()
                            {
                                Frequency = "Day",
                                Interval = 1
                            },
                            Name = "EventsByRegion",
                            LinkedServiceName = "AzureDataLakeAnalyticsLinkedService"
                        }
                    }
                }
            }
        });

I just noticed something in the Azure Data Factory view (the Monitor and Manage option): the pipeline status is Waiting: DatasetDependencies. Do I need to modify something in the code for this?

Answer

If no other activity in the factory produces your source dataset, you need to mark it as external by adding this property to the dataset definition:

        "external": true
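For reference, an ADF v1 input dataset marked as external might look like the following JSON sketch. This is an assumption-based example, not taken from the linked project: the dataset name, linked service name, and file paths here are illustrative, chosen to match the `/Samples/Data/SearchLog.tsv` input used in the question.

```json
{
  "name": "Dataset_Source",
  "properties": {
    "type": "AzureDataLakeStore",
    "linkedServiceName": "AzureDataLakeStoreLinkedService",
    "typeProperties": {
      "folderPath": "Samples/Data",
      "fileName": "SearchLog.tsv",
      "format": { "type": "TextFormat", "columnDelimiter": "\t" }
    },
    "external": true,
    "availability": { "frequency": "Day", "interval": 1 },
    "policy": {
      "externalData": {
        "retryInterval": "00:01:00",
        "retryTimeout": "00:10:00",
        "maximumRetry": 3
      }
    }
  }
}
```

Setting "external": true tells Data Factory the data is produced outside the factory, so the slice does not sit in Waiting: DatasetDependencies for an upstream activity that will never run. If you create datasets through the v1 .NET SDK rather than JSON, the equivalent flag is, as far as I know, the External property on DatasetProperties.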

https://docs.microsoft.com/en-us/azure/data-factory/data-factory-faq

https://docs.microsoft.com/en-us/azure/data-factory/data-factory-create-datasets

