How to build a self-referencing table


Question

The source table has two columns, as the following snapshot shows:

The destination table should then look something like this:

(DimLocationKey is an auto-generated surrogate key.)

How can I achieve this self-referencing effect in SSIS? I tried the following approach, but it does not work because the lookup never finds a match.

Recommended answer

If the column is nullable, you could first load the unique values of location_ID, then have a secondary process come back through to update the existing rows and possibly add new ones. After the first load the table would look like this:

1 NULL A NULL
2 NULL B NULL 
3 NULL C NULL
4 NULL D NULL
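
As a rough illustration of that first pass, here is a minimal sketch using Python's built-in sqlite3 module in place of SQL Server and the SSIS data flow. The sample rows, the source column names (location_id, parent_location_id), and the ParentDimLocationKey column are assumptions made for the example, since the original snapshots are not available; only DimLocationKey and location_ID come from the question.

```python
import sqlite3

# Hypothetical source rows: (location_id, parent_location_id).
source_rows = [("A", None), ("B", "A"), ("C", "A"), ("D", "B")]

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE DimLocation (
        DimLocationKey       INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        ParentDimLocationKey INTEGER NULL
            REFERENCES DimLocation (DimLocationKey),              -- assumed column name
        location_id          TEXT NOT NULL UNIQUE
    )
    """
)

# First pass: insert each distinct location once and leave the parent key NULL.
distinct_ids = sorted({loc for loc, _ in source_rows})
conn.executemany(
    "INSERT INTO DimLocation (location_id) VALUES (?)",
    [(loc,) for loc in distinct_ids],
)
conn.commit()

first_pass = conn.execute(
    "SELECT DimLocationKey, ParentDimLocationKey, location_id "
    "FROM DimLocation ORDER BY DimLocationKey"
).fetchall()
for row in first_pass:
    print(row)
```

Every row now has a surrogate key, so a later pass can look up the parent's key without the "no match" problem from the question.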

I suppose if it's not nullable, you could precompute those ids in a data flow and assign the current row and its parent to themselves. As a developer, I might hate you for that though ;)
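
One possible reading of the non-nullable variant, sketched in plain Python rather than as an SSIS component: keys are computed up front, and any row without a known parent points at itself. The function name assign_keys and the sample rows are hypothetical.

```python
def assign_keys(source_rows):
    """Return (key, parent_key, location_id) tuples with no NULL parent keys.

    source_rows is a list of (location_id, parent_location_id) pairs; this
    two-column shape is an assumption about the source table.
    """
    # Collect every id that appears either as a location or as a parent.
    all_ids = sorted(
        {loc for loc, _ in source_rows}
        | {parent for _, parent in source_rows if parent is not None}
    )
    # Precompute a surrogate key for each id before any row is written.
    key_of = {loc: key for key, loc in enumerate(all_ids, start=1)}
    parent_of = {loc: parent for loc, parent in source_rows}

    result = []
    for loc in all_ids:
        parent = parent_of.get(loc)
        # A row with no known parent references itself instead of NULL.
        parent_key = key_of[parent] if parent is not None else key_of[loc]
        result.append((key_of[loc], parent_key, loc))
    return result


rows = assign_keys([("A", None), ("B", "A"), ("C", "A"), ("D", "B")])
for row in rows:
    print(row)
```

With this convention the root row is the one whose parent key equals its own key, so any code that walks the hierarchy has to stop on that condition rather than on NULL.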

At this point, the question becomes whether the table should hold 8 rows or 4 (whatever your source data indicates). That is a question for the business users, appropriately "dumbed down". I've seen both answers when asking hierarchy questions such as "Who does the President report to?" At one place, the President reported to no one, which meant expense requests were automatically approved. At another, the CEO reported to themselves, which meant their expense reports still had to be approved by themselves. I guess that was to ensure executive accountability, since nothing was automagic.

If the answer is 8 rows, then your data flow looks about right. If it's 4, you'd use the existing data flow but update the rows instead. For a small set of rows (hundreds), you can use the OLE DB Command and write your UPDATE statement. Just realize that it issues an UPDATE statement for every row that hits the component. That can bring your processing to a standstill, as it's terribly inefficient.

The more efficient route for updates is to use the OLE DB Destination and, after the Data Flow has completed, have an Execute SQL Task issue a set-based UPDATE statement. See Andy Leonard's Stairway to Integration Services series for a well-written example of how to do this.
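
To make the set-based idea concrete, here is a minimal sketch that again uses Python's sqlite3 module as a stand-in for SQL Server. The StageLocation staging table, the ParentDimLocationKey column, and the sample data are assumptions for illustration. In T-SQL the same update would more commonly be written with an UPDATE ... FROM join, but the principle is the same: one statement resolves all parent keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical staging table holding the two source columns.
    CREATE TABLE StageLocation (location_id TEXT, parent_location_id TEXT);
    INSERT INTO StageLocation VALUES
        ('A', NULL), ('B', 'A'), ('C', 'A'), ('D', 'B');

    -- The dimension as it looks after the data flow: parent keys still NULL.
    CREATE TABLE DimLocation (
        DimLocationKey       INTEGER PRIMARY KEY,
        ParentDimLocationKey INTEGER NULL,
        location_id          TEXT NOT NULL UNIQUE
    );
    INSERT INTO DimLocation (location_id)
        SELECT location_id FROM StageLocation ORDER BY location_id;
    """
)

# A single set-based statement fills in every parent key, instead of one
# UPDATE per row as a row-by-row command component would issue.
conn.execute(
    """
    UPDATE DimLocation
    SET ParentDimLocationKey = (
        SELECT parent.DimLocationKey
        FROM StageLocation AS s
        JOIN DimLocation AS parent
          ON parent.location_id = s.parent_location_id
        WHERE s.location_id = DimLocation.location_id
    )
    """
)
conn.commit()

resolved = conn.execute(
    "SELECT DimLocationKey, ParentDimLocationKey, location_id "
    "FROM DimLocation ORDER BY DimLocationKey"
).fetchall()
for row in resolved:
    print(row)
```

Rows whose parent is missing from the source keep a NULL parent key, which matches the nullable design discussed earlier in the answer.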

If it's not nullable and nodes referencing themselves is not allowed, then it seems your data model does not accurately describe
