在Datamart中添加事实的有效方法? [英] Efficient way to add facts in a Datamart?

查看:95
本文介绍了在Datamart中添加事实的有效方法?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在WebApi服务器上运行的SQL Server中有一个Datamart。有一个事实表和多个Dim表。

I have a Datamart in SQL Server running in a WebApi server. There are a Fact table and multiple Dim tables.

维度值可以动态增长,因此,在接收到事实数据时,有必要检查Dim表。如果不存在,则需要将维度值添加到Dim表中。

The dimension values can grow dynamically, so, when the fact data is received, it is necessary to check the Dim table. If it does not exists, then I need to add the dimension value to the Dim table.

最后,我需要使用外键将事实记录插入到Fact表中

Finally I need to insert the fact record to Fact table, using the foreign keys to the Dim tables.

我需要为每个事实记录重复该过程。在Datamart系统中用动态维数据添加事实的有效方法是什么?

I need to repeat the process for each fact record. Which is the efficient way to add facts with dynamic dimensional data in a Datamart system?

推荐答案

我刚刚回答了类似的问题( https://stackoverflow.com/a/29433398/3964881 ),但我意识到这两个问题不是同样,尽管需要非常相似的答案。

I've just answered something quite similar (https://stackoverflow.com/a/29433398/3964881), but I realise the two questions aren't the same, despite needing quite similar answers.

正如我在另一个答案中指出的那样,通常的模式是先填充维度,然后填充事实(s)。因此,为了争辩,假设您有一个人维度和一个城镇维度。您应该有一个过程,可以从源系统中获取有关每个人员的所有相关信息,并将所有新信息加载到人员维度中。同样,另一个过程应该从您的源系统中提取所有Town信息,并将任何新信息加载到Town Dimension。

As I've noted in the other answer, the usual pattern is to populate your Dimension(s) first, and then your Fact(s). So for argument's sake, let's say you have a Person Dimension and a Town Dimension. You should have a process which pulls all of the relevant information about each Person from your source system(s), and loads any new ones into the Person Dimension. Likewise, another process should pull all of the Town information from your source system(s), and load any new ones into the Town Dimension.

完成后,您的事实负载可以运行,并且可以简单地查找Person和Town的代理键/ ID值,而无需检查该值是否存在,如果不存在则添加它。如果您的过程正常工作,那么它们肯定已经存在。

Once that's done, your Fact load can run and can simply look up the surrogate key/ID values for Person and Town, with no need to check whether the value exists and add it if not. If your process is working correctly they will definitely already exist.

这是一个更合理的解决方案,以防出现多个引用相同维的Fact表的情况-否则,您可能会遇到这样的情况,即必须按一定顺序加载事实表以确保Dimension数据就位,或者甚至可能冒着两个事实表加载过程都已加载的重复Dimension数据结束的风险同时使用相同的新数据。

This is a much sounder solution in case you get to a point where you have multiple Fact tables referencing the same Dimensions - otherwise, you could end up in a situation where your Fact tables have to load in a certain order to ensure the Dimension data is in place, or you could even risk ending up with duplicated Dimension data where two Fact table load processes have both loaded in the same "new" data at the same time.

这篇关于在Datamart中添加事实的有效方法?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆