How to perform Lookups in Azure Data Factory?

Question

I'm an SSIS developer. I do a lot of lookups with SQL stored procedures in SSIS, but now that I'm coming to Azure Data Factory I have no idea how to perform a lookup using a SQL stored procedure.

Can anyone guide me?

Thanks in advance! Jay

Recommended answer

Azure Data Factory (ADF) is more of an ELT tool than an ETL tool, so direct lookups are not supported. Instead, this type of operation, along with other transforms, is pushed down into the compute you are actually using. For example, if you are moving data to SQL Server, Azure SQL Database or Azure SQL Data Warehouse, you would ensure all the data is on the same server and use a Stored Procedure activity to execute the lookups using T-SQL and joins. If you are using Azure Data Lake Analytics (ADLA), you would use the U-SQL Activity to run U-SQL or execute ADLA stored procedures, again doing lookups via joins or custom U-SQL code such as a Combiner, Applier or Reducer. In fact, you can use any of the ADF compute options, such as SQL, HDInsight (including Hive, Pig, MapReduce, Streaming and Spark script), Machine Learning or custom .NET activities.
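To make the "lookup as a join" idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: an in-memory SQLite database stands in for SQL Server / Azure SQL, and the table names (`staging_orders`, `dim_customer`, `fact_orders`) and the function `load_fact_orders` are hypothetical. In a real setup, the `INSERT ... SELECT` statement would live in a T-SQL stored procedure invoked from the pipeline.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the target SQL database.
# All table and column names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staging_orders (order_id INTEGER, customer_code TEXT, amount REAL);
    CREATE TABLE dim_customer (customer_key INTEGER, customer_code TEXT);
    CREATE TABLE fact_orders (order_id INTEGER, customer_key INTEGER, amount REAL);

    INSERT INTO staging_orders VALUES (1, 'C001', 10.0), (2, 'C002', 25.5), (3, 'C999', 7.0);
    INSERT INTO dim_customer VALUES (100, 'C001'), (200, 'C002');
""")


def load_fact_orders(conn):
    """Set-based 'lookup': resolve customer_code to customer_key with a join."""
    conn.execute("""
        INSERT INTO fact_orders (order_id, customer_key, amount)
        SELECT s.order_id,
               COALESCE(d.customer_key, -1),  -- -1 marks a row with no match
               s.amount
        FROM staging_orders AS s
        LEFT JOIN dim_customer AS d
               ON d.customer_code = s.customer_code
    """)
    conn.commit()


load_fact_orders(conn)
rows = conn.execute(
    "SELECT order_id, customer_key, amount FROM fact_orders ORDER BY order_id"
).fetchall()
print(rows)  # [(1, 100, 10.0), (2, 200, 25.5), (3, -1, 7.0)]
```

The `LEFT JOIN` plus `COALESCE` plays roughly the role of an SSIS Lookup with a "no match" output: unmatched rows are kept and tagged with a placeholder key instead of being dropped. Whether you want that, or an inner join that discards them, depends on your load.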

So you need to think about things differently with ADF. Have a look through this article to gain a greater understanding of transforming data in ADF:

Transform data in Azure Data Factory https://docs.microsoft.com/en-us/azure/data-factory/data-factory-data-transformation-activities

As an aside, I would rarely use Lookups in SSIS, as performance in early versions used to be poor. Although this has been improved in later versions, generally if you can do it in SQL you probably should. This pattern harnesses the power of SQL Server, rather than dragging data up into the SSIS pipeline (e.g. for the purposes of lookups, which are essentially joins) and pushing the data back out again. I reserve Data Flow transformations mainly for cases where non-relational data is involved, e.g. XML, or joining your email server with relational data. This is my personal view anyway : )
