SSIS步骤将CSV从Azure Blob加载到Azure SQL [英] SSIS steps to load CSV from Azure blob to Azure SQL

查看:253
本文介绍了SSIS步骤将CSV从Azure Blob加载到Azure SQL的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我需要连接到Azure Blob(源)中的CSV文件,然后将数据加载到Azure SQL Server表中,然后将CSV文件移动到另一个(存档)Azure Blob中.

I need to connect to a CSV file in an Azure blob (source), then load the data into an Azure SQL Server table, then move the CSV file to a different (archive) Azure blob.

在没有Azure的情况下,我将创建到本地文件的平面文件连接,然后使用Source Assistant和Data Flow task目标助手将数据加载到SQL Server表中,然后将Control Flow中的File System task加载到存档目录中.

Without Azure, I would create a flat file connection to a local file, then a Data Flow task using Source Assistant & Destination Assistant to load the data into a SQL Server table, then a File System task in Control Flow to move the file to the Archive directory.

我想做类似的事情,直接连接到Azure blob中的文件,然后在执行数据流"任务之后,执行一个文件从一个Azure blob(源)移动到另一个Azure Blob(归档)的操作.

I'd like to do something similar, with a direct connection to the file in the Azure blob, then after the Data Flow task, do a file Move from one Azure blob (source) to another Azure Blob (archive).

我能想到的最好的办法是使用Azure Blob Download task将CSV移至运行SSIS的Azure VM(顺便说一句,您可以在没有VM的情况下获得Azure SSIS服务吗?),然后在下载后创建一个flat file connection& Data Flow加载数据,然后对Archive Blob执行Azure Blob Upload task.

The best I can figure is to use the Azure Blob Download task to move the CSV to the Azure VM where SSIS is running (by the way, can you get an Azure SSIS service without a VM?), then after download create a flat file connection & Data Flow to load the data, then perform an Azure Blob Upload task to the Archive Blob.

似乎应该有一种方法可以连接到源Azure blob文件&直接从中读取,而无需先下载.同样,似乎应该有一种在Azure Blob容器之间移动文件的方法.我能想到的最好的选择是download/upload选项,但是它需要一个中间位置(本地目录),并且下载后不会删除源文件.有SSIS Azure工具可以执行这些任务吗?

Seems there should be a way to connect to the source Azure blob file & read from it directly without having to download it first. Likewise, seems there should be a way to move files between Azure blob containers. Best I can come up with is the download/upload option, but that requires an intermediate location (local directory), and doesn't remove the source file after download. Are there SSIS Azure tools to do these tasks?

推荐答案

对问题的另一半有什么想法:在处理完文件后将文件从源blob移到存档blob吗?

Any ideas for the other half of the problem: moving the files from the Source blob to an Archive blob after processing them?

如我所知,没有内置任务可帮助您实现此目的.根据我的测试,我假设您可以利用脚本任务,并编写代码(VB或C#)以直接处理Blob.这是我的详细步骤,您可以参考它们:

As I known, there is no in-build task for you to achieve this purpose. Based on my test, I assume that you could leverage Script Task and write code (VB or C#) to handle blobs directly. Here are my detailed steps, you could refer to them:

1)使用 Azure Blob源数据流下的rel ="noreferrer"> OLE DB目标,用于将CSV文件从Azure Blob加载到Azure SQL数据库中.

1) Use Azure Blob Source and OLE DB Destination under Data Flow for loading a CSV file from Azure Blob into Azure SQL database.

2)将CSV数据成功加载到SQL表中之后,请使用脚本任务将源Blob移至存档Blob.

2) After successfully load the CSV data into SQL table, use a Script Task for moving the source blob to an archive blob.

我将调用Blob服务REST API 复制Blob 删除Blob 容器SAS令牌,您可以利用 Microsoft Azure存储资源管理器并遵循此官方

I would invoke Blob Service REST API copy Blob and Delete Blob with Container SAS token, you could leverage Microsoft Azure Storage Explorer and follow this official tutorial to generate the SAS token for your blob container.

假设源blob和目标blob在同一容器下,然后按如下所示添加三个变量(SourceBlobUrlContainerSasTokenArchiveBlobUrl),并在脚本任务编辑器中将它们添加为ReadOnlyVariables ,您可以参考此

Assuming the source blob and destination blob are under the same container, then I add three variables (SourceBlobUrl,ContainerSasToken,ArchiveBlobUrl) as follows and add them as ReadOnlyVariables in Script Task Editor, you could refer to this tutorial for using Variables in the Script Task.

单击脚本任务编辑器"下的"编辑脚本"按钮,以启动用于在其中编写自定义脚本的VSTA开发环境.这是ScriptMain.cs下的Main方法,如下所示:

Click Edit Script button under Script Task Editor to launch the VSTA development environment in which you write your custom script. Here is the Main method under ScriptMain.cs as follows:

public async void Main()
{
    // TODO: Add your code here
    string sasToken = Dts.Variables["ContainerSasToken"].Value.ToString();
    string sourceBlobUrl = Dts.Variables["SourceBlobUrl"].Value.ToString();
    string archiveBlobUrl = Dts.Variables["ArchiveBlobUrl"].Value.ToString();

    try
    {
        HttpClient client = new HttpClient();
        client.DefaultRequestHeaders.Add("x-ms-copy-source", sourceBlobUrl + sasToken);
        //copy source blob to archive blob
        Dts.Log($"start copying blob from [{sourceBlobUrl}] to [{archiveBlobUrl}]...", 0, new byte[0]);
        HttpResponseMessage response = await client.PutAsync(archiveBlobUrl + sasToken, null);
        if (response.StatusCode == HttpStatusCode.Accepted || response.StatusCode == HttpStatusCode.Created)
        {
            client.DefaultRequestHeaders.Clear();
            Dts.Log($"start deleting blob [{sourceBlobUrl}]...", 0, new byte[0]);
            //delete source blob
            HttpResponseMessage result = await client.DeleteAsync(sourceBlobUrl + sasToken);
            if (result.StatusCode == HttpStatusCode.Accepted || result.StatusCode == HttpStatusCode.Created)
            {
                Dts.TaskResult = (int)ScriptResults.Success;
                return;
            }
        }
        Dts.TaskResult = (int)ScriptResults.Failure;
    }
    catch (Exception ex)
    {
        Dts.Events.FireError(-1, "Script Task - Move source blob to an archive blob", ex.Message + "\r" + ex.StackTrace, String.Empty, 0);
        Dts.TaskResult = (int)ScriptResults.Failure;
    }
}

结果

此外,您还可以利用 Microsoft .NET的Microsoft Azure存储客户端库,以访问存储Blob,此时,您需要将程序集加载到GAC之外的SSIS脚本任务中,有关更多详细信息,请参考此官方

Additionally, you could also leverage Microsoft Azure Storage Client Library for .NET to access storage blob, at this point, you need to load the assembly in a SSIS script task that is not in the GAC, for more details you could refer to this official blog.

这篇关于SSIS步骤将CSV从Azure Blob加载到Azure SQL的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆