Moving 100 GB of data on a daily basis
Question
I would like to use an Azure Data Factory pipeline to move 100 GB of data from Azure Data Lake to Azure Table storage.
At the moment I have a single 100 GB file in ADL that has a few columns (a sketch of how one row might land in Table storage follows the list):
Partition (string with 2 characters)
Key (string)
Date (date)
Column1 (integer)
Column2 (integer)
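Azure Table storage requires every entity to have a PartitionKey and a RowKey, so the Partition and Key columns map naturally onto that model. Roughly, one row could look like the following sketch (the concrete values and the azure-data-tables SDK usage are just illustrative):

```python
from datetime import datetime, timezone

# Hypothetical row from the ADL file, expressed as a Table storage entity:
# the Partition column becomes PartitionKey and the Key column becomes RowKey.
entity = {
    "PartitionKey": "AB",            # 2-character Partition column
    "RowKey": "some-unique-key",     # Key column (unique within the partition)
    "Date": datetime(2019, 1, 1, tzinfo=timezone.utc),
    "Column1": 42,
    "Column2": 7,
}

# With the azure-data-tables SDK such an entity could be written as:
#   from azure.data.tables import TableClient
#   client = TableClient.from_connection_string("<connection-string>", "<table-name>")
#   client.upsert_entity(entity)
```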
I've tried to run the pipeline, but it was working much slower than expected (if I let it continue, it would run for several days). Please help me learn how to investigate ADF performance and what parameters I should try changing.
Useful information:
- All Azure services are located in the same data center.
- Data in ADL is sorted by PartitionKey.
- I've set 'Copy parallelism' of the copy activity to 64 (see the settings sketch after this list).
- I've tried to increase '
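Roughly, the throughput-related settings I'm experimenting with sit in the copy activity definition like this (an illustrative fragment only; the names and values here are placeholders rather than my actual pipeline JSON):

```python
# Illustrative fragment of a v2 copy activity, written as a Python dict.
copy_activity = {
    "name": "CopyAdlToTableStorage",  # placeholder activity name
    "type": "Copy",
    "typeProperties": {
        "source": {"type": "AzureDataLakeStoreSource"},
        "sink": {
            "type": "AzureTableSink",
            # Columns whose values become PartitionKey / RowKey in the sink.
            "azureTablePartitionKeyName": "Partition",
            "azureTableRowKeyName": "Key",
            # Rows buffered before each write to Table storage.
            "writeBatchSize": 10000,
        },
        # Requested parallelism (64 in my case) and Data Integration Units;
        # the run may actually use fewer ('usedParallelCopies' in the output).
        "parallelCopies": 64,
        "dataIntegrationUnits": 32,
    },
}
```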
Tatyana Yakushev [tableau.com]
Answer
Hello Tatyana,
Do you know what 'throughput' and 'usedParallelCopies' your activity is achieving? There is more information on copy activity monitoring here if you haven't seen it:
https://docs.microsoft.com/zh-CN/azure/data-factory/copy-activity-overview#monitoring
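If you prefer to pull those numbers programmatically rather than reading them in the portal's monitoring view, something along these lines with the azure-mgmt-datafactory Python SDK should work (a rough sketch; the subscription, resource group, factory name and pipeline run ID are placeholders):

```python
from datetime import datetime, timedelta, timezone

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import RunFilterParameters

# Placeholders: fill in your own subscription, resource group, factory and run ID.
client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Look at activity runs belonging to one pipeline run, updated within the last day.
now = datetime.now(timezone.utc)
filters = RunFilterParameters(last_updated_after=now - timedelta(days=1),
                              last_updated_before=now)

runs = client.activity_runs.query_by_pipeline_run(
    "<resource-group>", "<data-factory-name>", "<pipeline-run-id>", filters)

for run in runs.value:
    out = run.output or {}  # copy activity output holds the throughput figures
    print(run.activity_name,
          "throughput:", out.get("throughput"),
          "usedParallelCopies:", out.get("usedParallelCopies"),
          "rowsCopied:", out.get("rowsCopied"))
```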