Fact Table with Different Update Schedules


Question

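I have two sets of data at the same level of granularity, for example invoice number. Most of the data is updated daily as we recognize the revenue for previous invoices. However, some of it passes through a separate costing system once a month and is then fed to the data warehouse with additional information.

Should I create one fact table that contains both sets of data and run an update on it once a month when the other data is imported, or should I create two fact tables because of the different update schedules?

The data is related, and many queries (~35%) will want information from both sets of data (when available). The system imports 30,000 rows a day, the fact table has about 38,000,000 rows in it, and the monthly update would affect 660,000 rows.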

Answer

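Provided that the measures already in the table are not modified in the second step, you could treat the fact table as an "accumulating snapshot". This kind of table describes processes with a definite start and end, such as workflows: each row is inserted once and later revisited to fill in the values that arrive at later stages. Look it up in Kimball's Data Warehouse Toolkit, or just search for "Kimball accumulating snapshot fact table".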
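As a rough illustration of the single-table approach, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `fact_invoice`, its columns, the helper functions, and the sample values are all assumptions made up for this example; they do not come from the question. The point it tries to show is that the daily load inserts rows with the cost columns left empty, and the monthly load only fills those columns without touching the measures that are already there.

```python
import sqlite3

# In-memory database, only for illustration.
conn = sqlite3.connect(":memory:")

# Hypothetical fact table at invoice grain. The cost columns are
# nullable because they are only known after the monthly costing run.
conn.execute("""
    CREATE TABLE fact_invoice (
        invoice_number INTEGER PRIMARY KEY,
        invoice_date   TEXT NOT NULL,
        revenue        REAL NOT NULL,
        cost           REAL,   -- filled by the monthly load
        cost_load_date TEXT    -- filled by the monthly load
    )
""")


def load_daily(rows):
    """Daily load: insert new invoices; cost columns stay NULL."""
    conn.executemany(
        "INSERT INTO fact_invoice (invoice_number, invoice_date, revenue) "
        "VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()


def load_monthly_costs(rows, load_date):
    """Monthly load: fill in cost columns only; revenue is not touched."""
    conn.executemany(
        "UPDATE fact_invoice SET cost = ?, cost_load_date = ? "
        "WHERE invoice_number = ?",
        [(cost, load_date, invoice) for invoice, cost in rows],
    )
    conn.commit()


load_daily([
    (1001, "2024-01-05", 250.0),
    (1002, "2024-01-06", 400.0),
    (1003, "2024-02-01", 120.0),  # not yet costed
])
load_monthly_costs([(1001, 180.0), (1002, 310.0)], "2024-02-01")

# A query that needs both sets of data reads one table, with no join.
rows = conn.execute(
    "SELECT invoice_number, revenue, cost, revenue - cost AS margin "
    "FROM fact_invoice ORDER BY invoice_number"
).fetchall()
for row in rows:
    print(row)
```

Invoices that have not been through the costing system yet simply show `NULL` (Python `None`) for cost and margin, which matches the "when available" requirement in the question. Whether a monthly update of this size performs acceptably on a table of roughly 38 million rows depends on the database and its indexing, so that part would need to be tested on the real system.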


