Azure Data Factories vs SSIS


Question

I am thinking of moving our SSIS ETLs to Azure Data Factory. My arguments in favour of such a leap are:

  • Our sources and targets are already in the cloud. ADF is cloud native, so it seems a good fit.

  • ADF is a service, and therefore we could consume and pay for it on demand. SSIS implies licensing costs and doesn't lend itself naturally to on-demand consumption (we thought of using DevOps to spin up ETL servers on an ad-hoc basis).

  • Generating ETL code programmatically with SSIS requires very specific skills such as BIML or the DTS API. By moving to ADF, I am hoping the combination of JSON with the TSQL and C# in USQL will make the necessary skills more generic.

I am hoping members of the community can share their experiences and thus help me come to a decision.

Answer

The answers to this old post are quite outdated. My comments below relate to ADF version 2.

First of all, ADF has the capability to run SSIS packages, so moving your legacy ETL processes there and adopting ADF incrementally is not only possible but recommended. You don't want to change everything with every new piece of technology that comes out. You can then implement only new or modified ETL processes as ADF activities.
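For illustration only, here is a minimal sketch of what that lift-and-shift option can look like in pipeline JSON: an Execute SSIS Package activity pointing at a package in SSISDB and running on an Azure-SSIS Integration Runtime. The activity name, package path, and Integration Runtime name are placeholders I made up, not anything from the original setup.

```json
{
  "name": "RunLegacyEtlPackage",
  "description": "Illustrative placeholder: runs an existing .dtsx package from SSISDB on an Azure-SSIS Integration Runtime",
  "type": "ExecuteSSISPackage",
  "typeProperties": {
    "packageLocation": {
      "type": "SSISDB",
      "packagePath": "LegacyFolder/LegacyProject/LoadSales.dtsx"
    },
    "loggingLevel": "Basic",
    "connectVia": {
      "referenceName": "AzureSsisIntegrationRuntime",
      "type": "IntegrationRuntimeReference"
    }
  }
}
```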

Secondly, although it may not be completely there yet, with ADF data flows you can do the transformations you would otherwise do with SSIS. There are still some missing bits and pieces, but most of the commonly used functionality is there.
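A data flow authored in the UI is then invoked from a pipeline activity. The sketch below is roughly the shape that activity takes; the names and compute sizing are placeholders, and the exact properties may differ from what the ADF editor generates.

```json
{
  "name": "TransformSalesData",
  "description": "Illustrative placeholder: runs a Mapping Data Flow authored visually in the ADF UI",
  "type": "ExecuteDataFlow",
  "typeProperties": {
    "dataFlow": {
      "referenceName": "CleanseSalesDataFlow",
      "type": "DataFlowReference"
    },
    "compute": {
      "coreCount": 8,
      "computeType": "General"
    }
  }
}
```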

ADF authoring does not require Visual Studio. It does need specific skills, but I found the learning curve not to be steep. Documentation and best practices are still a bit lacking in certain areas, but someone already experienced in database / data warehouse architecture and ETL will find it relatively easy. The best thing about it is that most things can be done visually without messing with the code (which is just simple JSON).
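To give a sense of that JSON, here is a rough sketch of what a simple pipeline with a single Copy activity looks like behind the visual editor. All dataset and pipeline names are placeholders, and real definitions would carry more source/sink detail.

```json
{
  "name": "CopyStagingToWarehouse",
  "properties": {
    "description": "Illustrative placeholder: the JSON behind a simple pipeline built in the visual editor",
    "activities": [
      {
        "name": "CopySalesData",
        "type": "Copy",
        "inputs": [
          { "referenceName": "StagingSalesDataset", "type": "DatasetReference" }
        ],
        "outputs": [
          { "referenceName": "WarehouseSalesDataset", "type": "DatasetReference" }
        ],
        "typeProperties": {
          "source": { "type": "AzureSqlSource" },
          "sink": { "type": "AzureSqlSink" }
        }
      }
    ]
  }
}
```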

Furthermore, ADF integrates with Azure DevOps and uses Git for versioning, so you get change management for free.
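That integration is configured on the factory itself; the snippet below is roughly the shape of the Git settings for an Azure DevOps repository (the FactoryVSTSConfiguration block). Every value here is a made-up placeholder.

```json
{
  "repoConfiguration": {
    "type": "FactoryVSTSConfiguration",
    "accountName": "contoso-devops-org",
    "projectName": "DataPlatform",
    "repositoryName": "adf-pipelines",
    "collaborationBranch": "main",
    "rootFolder": "/"
  }
}
```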

For more advanced needs, you can also run Databricks activities with Java (Scala) or Python, and integrate with Hadoop (Hive and Pig) and Spark.
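As a rough sketch, calling out to Databricks from a pipeline looks something like the activity below: it runs a notebook through a Databricks linked service. The linked service name, notebook path, and parameter are hypothetical examples.

```json
{
  "name": "RunAdvancedTransform",
  "description": "Illustrative placeholder: calls a Databricks notebook from the pipeline for heavier transformations",
  "type": "DatabricksNotebook",
  "linkedServiceName": {
    "referenceName": "AzureDatabricksLinkedService",
    "type": "LinkedServiceReference"
  },
  "typeProperties": {
    "notebookPath": "/etl/advanced_transformations",
    "baseParameters": {
      "run_date": "2020-01-01"
    }
  }
}
```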

Finally, ADF incorporates monitoring and diagnostic tools which in SSIS you had to build yourself. You can see much more easily which activity failed and what the error was.
