数据仓库中的时间和日期维度 [英] Time and date dimension in data warehouse

查看:677
本文介绍了数据仓库中的时间和日期维度的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在建立一个数据仓库。每个事实都有其时间戳。我需要按天,月,季度但按小时创建报告。通过查看示例,我看到日期倾向于保存在维表中。

(来源: etl-tools.info

I'm building a data warehouse. Each fact has it's timestamp. I need to create reports by day, month, quarter but by hours too. Looking at the examples I see that dates tend to be saved in dimension tables.
(source: etl-tools.info)

但是我认为,时间没有意义。尺寸表会越来越大。另一方面,具有日期维表的JOIN比在 SQL 中使用日期/时间函数更有效。

But I think, that it makes no sense for time. The dimension table would grow and grow. On the other hand JOIN with date dimension table is more efficient than using date/time functions in SQL.

您有什么意见/解决方案?

What are your opinions/solutions ?

(我正在使用Infobright)

(I'm using Infobright)

推荐答案

我的猜测是,这取决于您的报告要求。
如果您需要类似

My guess is that it depends on your reporting requirement. If you need need something like

WHERE "Hour" = 10

表示每天10:00:00到10:59:59之间的时间,那么我会使用时间维度,因为它快于

meaning every day between 10:00:00 and 10:59:59, then I would use the time dimension, because it is faster than

WHERE date_part('hour', TimeStamp) = 10  

因为date_part()函数将针对每一行进行评估。
您仍应在事实表中保留时间戳,以便在几天的边界内进行汇总,例如:

because the date_part() function will be evaluated for every row. You should still keep the TimeStamp in the fact table in order to aggregate over boundaries of days, like in:

WHERE TimeStamp between '2010-03-22 23:30' and '2010-03-23 11:15' 

使用维度字段时会很尴尬。

which gets awkward when using dimension fields.

通常,时间维度具有分钟分辨率,因此有1440行。

Usually, time dimension has a minute resolution, so 1440 rows.

这篇关于数据仓库中的时间和日期维度的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆