Copy different types of files from Azure Data Lake Gen1 to Azure Data Lake Gen2 while preserving attributes (such as last-updated time)

Problem description

I need to migrate all my data from Azure Data Lake Gen1 to Gen2. The lake holds a mix of file types (.txt, .zip, .json, and many others), and we want to move them as-is to the Gen2 lake. We also want each file to keep the last-updated time it had in the Gen1 lake.

I was looking at Azure Data Factory (ADF) for this use case. ADF requires a dataset definition, and a dataset requires a data format (Avro, JSON, XML, binary, etc.). Because the file types are mixed, I tried the binary format. With the binary format, however, every file at the destination ends up with content type "application/octet-stream", and the file update time is not retained.

Recommended answer

Last Modified Time is system metadata that records when the file was modified in the destination filesystem/container, and it cannot be updated. The workaround is to add user metadata that captures the source file's metadata; PowerShell or the .NET or Java SDK can be used to set such additional properties. The answer implemented this workaround in PowerShell.
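As an illustration of the user-metadata workaround, here is a minimal sketch in Python rather than PowerShell (a swapped-in language, not the answer's script). It assumes the Gen1 modification time is available as epoch milliseconds, that `file_client` is a `DataLakeFileClient` from the `azure-storage-file-datalake` package, and that guessing the content type from the file extension is acceptable. The metadata key `gen1_last_modified` and the helper names are hypothetical.

```python
import mimetypes
from datetime import datetime, timezone

# Hypothetical metadata key; any valid metadata name would work.
SOURCE_MTIME_KEY = "gen1_last_modified"


def build_user_metadata(modification_time_ms: int) -> dict:
    """Convert a Gen1 modification time (assumed epoch milliseconds)
    into a user-metadata dict holding an ISO 8601 UTC timestamp."""
    stamp = datetime.fromtimestamp(modification_time_ms / 1000, tz=timezone.utc)
    return {SOURCE_MTIME_KEY: stamp.strftime("%Y-%m-%dT%H:%M:%SZ")}


def guess_content_type(path: str) -> str:
    """Guess a MIME type from the file extension, falling back to the
    generic binary type when the extension is unknown."""
    content_type, _ = mimetypes.guess_type(path)
    return content_type or "application/octet-stream"


def tag_gen2_file(file_client, path: str, modification_time_ms: int) -> None:
    """Attach the source timestamp and a guessed content type to a file
    that has already been copied to Gen2.

    file_client is assumed to be an
    azure.storage.filedatalake.DataLakeFileClient for that file.
    """
    # Imported lazily so the pure helpers above work without the SDK installed.
    from azure.storage.filedatalake import ContentSettings

    # Note: set_metadata replaces any user metadata already on the file.
    file_client.set_metadata(build_user_metadata(modification_time_ms))
    file_client.set_http_headers(
        content_settings=ContentSettings(content_type=guess_content_type(path))
    )
```

The system Last Modified Time on the Gen2 file still reflects the copy; the original timestamp is only available to consumers that read the user metadata.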
