使用 SCD1 和 SCD2 属性加载混合维度表 + SSIS [英] Loading Hybrid Dimension Table with SCD1 and SCD2 attributes + SSIS

查看:28
本文介绍了使用 SCD1 和 SCD2 属性加载混合维度表 + SSIS的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我刚刚开始一项新任务,其中我需要使用 SCD1 和 SCD2 加载混合维度表.这需要作为 SSIS 包来实现.有人可以指导在 SSIS 中处理这个问题的最佳方法是什么,我应该使用 SCD 组件还是有其他方法?这方面的最佳做法是什么.

对于 SCD2 类型,使用 Merge 语句.

谢谢

解决方案

那是一堆蠕虫 :)

在 SSIS 中基本上有四种处理 SCD 的方法:1.使用内置的SCD组件2. 使用查找、条件拆分、派生列和各种目标滚动您自己的".3.使用T-SQL MERGE4.使用第三方Kimball SCD组件

我会提醒你我对 #4 的偏见 - 我写的.但这是我对这群人的分析.

1 是小"和简单"维度的一个很好的解决方案.为什么好?它可以理解,处理 SCD 1 和 2,并且易于设置.但为什么只有小"和简单"的维度?因为它使用无法改进的内部未缓存查找 (RBAR).因为如果您更改其中的任何内容(重新运行向导),它会破坏您对数据流所做的任何更改.并且因为它不会处理区分大小写不重要或尾随空格不重要的行.

2 是较大尺寸的一个很好的解决方案.它很好,因为它表现得非常好,并且有据可查",因为您可以从您使用的组件的名称以及它们如何组合在一起,确切地看到它在做什么.操作和更改操作方式也很容易.缺点是设置和测试需要时间.

3 是大尺寸的一个很好的解决方案.它通常优于所有其他替代方案.但这就是它所做的一切.编写代码非常复杂,如果没有大量注释就不太容易理解.

4 对于几乎任何尺寸都是一个很好的解决方案,除了巨大"的尺寸.它像库存 SCD 组件一样易于"使用,性能与 2 一样好或更好,并且与 2 一样可配置.

这里有关于 4 的更多信息.>

I am just in a process of starting a new task, wherein in i need to load Hybrid Dimension Table with SCD1 and SCD2. This need to be achieved as a SSIS Package. Can someone guide what would be the best way dealing this in SSIS, should i used SCD component or there is other way? What are the best practices for this.

For SCD2 type, am using Merge statement.

Thanks

解决方案

That's a can of worms :)

There are basically four ways to handle SCDs in SSIS: 1. Using the built-in SCD component 2. "Rolling your own" using Lookups, Conditional Splits, Derived Columns, and various destinations. 3. Using T-SQL MERGE 4. Using the third party Kimball SCD component

I'll alert you to my bias towards #4 - I wrote it. But here's my analysis of the bunch.

1 is a good solution for "small" and "easy" dimensions. Why is it good? It's understandable, handles SCD 1 and 2, and is easy to set up. But why only "small" and "easy" dimensions? Because it uses an internal uncached lookup (RBAR) that can't be improved. Because if you change anything in it (re-run the wizard), it destroys any changes you've made to the data flow. And because it won't handle rows where case sensitivity isn't important, or trailing spaces aren't important.

2 is a good solution for larger dimensions. It's good because it performs pretty well, and is "well documented" in that you can see exactly what it's doing from the names of the components you use and how they're put together. It's also easy to manipulate and change how it operates. The downside is that it takes time to set up and test.

3 is a good solution for huge dimensions. It usually outperforms all other alternatives. But that's about all it has going for it. It's very complex to code, and not very understandable without tons of comments.

4 is a good solution for just about any size except maybe "huge" dimensions. It's "easy" to use like the stock SCD component, performs as good or better than 2, and is as configurable as 2.

More info on 4 here.

这篇关于使用 SCD1 和 SCD2 属性加载混合维度表 + SSIS的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆