将数据从Azure Blob存储复制到AWS S3 [英] Copy Data From Azure Blob Storage to AWS S3

查看:409
本文介绍了将数据从Azure Blob存储复制到AWS S3的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我是Azure数据工厂的新手,并且有一个有趣的要求.

I am new to Azure Data Factory and have an interesting requirement.

我需要将文件从Azure Blob存储移动到Amazon S3,最好使用Azure数据工厂.

I need to move files from Azure Blob storage to Amazon S3, ideally using Azure Data Factory.

但是不支持将S3作为接收器;

However S3 isnt supported as a sink;

https://docs.microsoft.com/zh-cn/azure/data-factory/copy-activity-overview

我还从这里阅读的各种评论中了解到,您不能直接从Blob存储复制到S3-您需要在本地下载文件,然后将其上传到S3.

I also understand from a variety of comments i've read on here that you cannot directly copy from Blob Storage to S3 - you would need to download the file locally and then upload it to S3.

没有人知道在Data Factory,SSIS或Azure Runbook中可以做到这一点的任何示例吗,我想一个选择是编写一个从Data Factory调用的Azure逻辑应用程序或函数.

Does anyone know of any examples, in Data factory, SSIS or Azure Runbook that can do such a thing, I suppose an option would be to write an azure logic-app or function that is called from Data Factory.

推荐答案

设法解决此问题-对其他人可能有用.

Managed to get something working on this - it might be useful for someone else.

我决定编写一个使用HTTP请求作为触发器的Azure函数.

I decided to write an azure function that uses a HTTP request as a trigger.

这两篇文章对我有很大帮助;

These two posts helped me a lot;

如何使用NuGet我的Azure Functions中的软件包?

使用C#从Azure Blob复制到AWS S3

如果您使用的是Azure函数2.x,请注意我对Nuget软件包的回答.

Please note my answer to the Nuget packages if you are using Azure functions 2.x.

这里是代码-您可以根据需要修改此基础. 我返回JSON序列化对象,因为Azure数据工厂需要此作为来自管道发送的http请求的响应;

Here is the code - you can modify the basis of this to your needs. I return a JSON Serialized object because Azure Data Factory requires this as a response from a http request sent from a pipeline;

#r "Microsoft.WindowsAzure.Storage"
#r "Newtonsoft.Json"
#r "System.Net.Http"

using System.Net;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.Primitives;
using Newtonsoft.Json;
using Microsoft.WindowsAzure.Storage.Blob;
using System.Net.Http;
using Amazon.S3; 
using Amazon.S3.Model;
using Amazon.S3.Transfer;
using Amazon.S3.Util;


public static async  Task<IActionResult> Run(HttpRequest req, ILogger log)
{
    log.LogInformation("Example Function has recieved a HTTP Request");

    // get Params from query string
    string blobUri = req.Query["blobUri"];
    string bucketName = req.Query["bucketName"];

    // Validate query string
    if (String.IsNullOrEmpty(blobUri) || String.IsNullOrEmpty(bucketName)) {

        Result outcome = new Result("Invalid Parameters Passed to Function",false,"blobUri or bucketName is null or empty");
        return new BadRequestObjectResult(outcome.ConvertResultToJson());
    }

    // cast the blob to its type
    Uri blobAbsoluteUri = new Uri(blobUri);
    CloudBlockBlob blob = new CloudBlockBlob(blobAbsoluteUri);

    // Do the Copy
    bool resultBool = await CopyBlob(blob, bucketName, log);

    if (resultBool) { 
        Result outcome = new Result("Copy Completed",true,"Blob: " + blobUri + " Copied to Bucket: " + bucketName);
        return (ActionResult)new OkObjectResult(outcome.ConvertResultToJson());       
    }
    else {
        Result outcome = new Result("ERROR",false,"Copy was not successful Please review Application Logs");
        return new BadRequestObjectResult(outcome.ConvertResultToJson()); 
    }  
}

static async Task<bool> CopyBlob(CloudBlockBlob blob, string existingBucket, ILogger log) {

        var accessKey = "myAwsKey";
        var secretKey = "myAwsSecret";
        var keyName = blob.Name;

        // Make the client 
        AmazonS3Client myClient = new AmazonS3Client(accessKey, secretKey, Amazon.RegionEndpoint.EUWest1);

        // Check the Target Bucket Exists; 
        bool bucketExists = await AmazonS3Util.DoesS3BucketExistAsync (myClient,existingBucket);

        if (!bucketExists) {
            log.LogInformation("Bucket: " + existingBucket + " does not exist or is inaccessible to the application");
            return false;
        }

        // Set up the Transfer Utility
        TransferUtility fileTransferUtility = new TransferUtility(myClient);

        // Stream the file
        try {

            log.LogInformation("Starting Copy");

            using (var stream = await blob.OpenReadAsync()) {

                // Note: You need permissions to not be private on the source blob
                log.LogInformation("Streaming");

                await fileTransferUtility.UploadAsync(stream,existingBucket,keyName);

                log.LogInformation("Streaming Done");   
            }

            log.LogInformation("Copy completed");
        }
        catch (AmazonS3Exception e) {
                log.LogInformation("Error encountered on server. Message:'{0}' when writing an object", e.Message);
            }
        catch (Exception e) {
                log.LogInformation("Unknown encountered on server. Message:'{0}' when writing an object", e.Message);
                return false;
        }

        return true; 
    }

public class Result {

    public string result;
    public bool outcome;
    public string UTCtime;
    public string details; 

    public Result(string msg, bool outcomeBool, string fullMsg){
        result=msg;
        UTCtime=DateTime.Now.ToString("yyyy-MM-dd h:mm:ss tt");
        outcome=outcomeBool;
        details=fullMsg;
    }

    public string ConvertResultToJson() {
        return JsonConvert.SerializeObject(this);
    } 
}

这篇关于将数据从Azure Blob存储复制到AWS S3的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆