What is the advantage of using a date dimension table over directly storing a date?


Problem description


I need to store a fairly large history of data, and I have been researching the best ways to store such an archive. It seems that a data warehouse approach is what I need. It also seems highly recommended to use a date dimension table rather than the date itself. Can anyone please explain to me why a separate table would be better? I don't need to summarize any of the data, just access it quickly and efficiently for any given day in the past. I'm sure I'm missing something, but I just can't see how storing the dates in a separate table is any better than just storing a date in my archive.

I have found these enlightening posts, but nothing that quite answers my question.

Solution

Well, one advantage is that as a dimension you can store many other attributes of the date in that other table: is it a holiday, is it a weekday, what fiscal quarter is it in, what is the UTC offset for a specific time zone (or several), and so on. Some of those you could calculate at runtime, but in a lot of cases it's better (or only possible) to pre-calculate.
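To make the pre-calculation idea concrete, here is a minimal Python sketch that builds date dimension rows in memory. The holiday set, the July fiscal-year start, the `YYYYMMDD` integer key, and all function and column names are assumptions chosen for illustration; none of them come from the question or from any particular warehouse.

```python
from datetime import date, timedelta

# Hypothetical holiday calendar; a real one would come from the business.
HOLIDAYS = {date(2024, 1, 1), date(2024, 12, 25)}


def fiscal_quarter(d, fiscal_start_month=7):
    """Quarter (1-4) of a fiscal year assumed to start in fiscal_start_month."""
    return ((d.month - fiscal_start_month) % 12) // 3 + 1


def build_date_dimension(start, end):
    """Return one dimension row per calendar day from start to end inclusive."""
    rows = []
    d = start
    while d <= end:
        rows.append({
            # Integer key in YYYYMMDD form, e.g. 20240101 (an assumed convention).
            "date_key": d.year * 10000 + d.month * 100 + d.day,
            "full_date": d,
            "is_weekday": d.weekday() < 5,  # Monday=0 ... Friday=4
            "is_holiday": d in HOLIDAYS,
            "fiscal_quarter": fiscal_quarter(d),
        })
        d += timedelta(days=1)
    return rows


dim = build_date_dimension(date(2024, 1, 1), date(2024, 12, 31))
```

Once rows like these are loaded into a table, a query can filter on `is_holiday` or group by `fiscal_quarter` with a plain join instead of recomputing the rule each time.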

Another is that if you just store the DATE in the table, you only have one option for indicating a missing date (NULL) or you need to start making up meaningless token dates like 1900-01-01 to mean one thing (missing because you don't know) and 1899-12-31 to mean another (missing because the task is still running, the person is still alive, etc). If you use a dimension, you can have multiple rows that represent specific reasons why the DATE is unknown/missing, without any "magic" values.
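The multiple-rows idea could look roughly like the following sketch, which uses Python's built-in `sqlite3` module with an in-memory database. The table names (`dim_date`, `task`), the negative surrogate keys, and the descriptions are hypothetical; the point is only that each reason for a missing date gets its own dimension row, so the fact table never holds NULL or a made-up calendar date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_date (
        date_key    INTEGER PRIMARY KEY,
        full_date   TEXT,              -- NULL only for the special rows
        description TEXT NOT NULL
    );
    CREATE TABLE task (
        task_id      INTEGER PRIMARY KEY,
        end_date_key INTEGER NOT NULL REFERENCES dim_date(date_key)
    );
    -- Two special rows explain why a date is absent; one row is a real day.
    INSERT INTO dim_date VALUES
        (-2, NULL, 'Not finished yet'),
        (-1, NULL, 'Unknown'),
        (20240105, '2024-01-05', 'Calendar date');
    INSERT INTO task VALUES (1, 20240105), (2, -1), (3, -2);
""")

# Every task joins to exactly one dimension row, including the special ones.
rows = conn.execute("""
    SELECT t.task_id, d.full_date, d.description
    FROM task AS t
    JOIN dim_date AS d ON d.date_key = t.end_date_key
    ORDER BY t.task_id
""").fetchall()

for row in rows:
    print(row)
```

The negative keys are still arbitrary identifiers, but they never masquerade as real dates, and the reason is readable from the dimension row itself.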

Personally, I would prefer to just store a DATE, because it is smaller than an INT (!) and it keeps all kinds of date-related properties, the ability to perform date math etc. If the reason the date is missing is important, I could always add a column to the table to indicate that. But I am answering with someone else's data warehousing hat on.
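For comparison, the plain-date alternative with an extra reason column might be sketched as below. The record type, field names, and reason values are all hypothetical, and this models a single row in Python rather than an actual table definition.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class MissingReason(Enum):
    UNKNOWN = "unknown"
    STILL_RUNNING = "still running"


@dataclass
class TaskRecord:
    task_id: int
    start_date: date
    end_date: Optional[date] = None
    # Filled in only when end_date is missing and the reason matters.
    end_date_missing_reason: Optional[MissingReason] = None

    def duration_days(self) -> Optional[int]:
        """Date math works directly on the stored dates; None if still open."""
        if self.end_date is None:
            return None
        return (self.end_date - self.start_date).days


finished = TaskRecord(1, date(2024, 1, 1), date(2024, 1, 5))
running = TaskRecord(2, date(2024, 1, 3), None, MissingReason.STILL_RUNNING)
```

Here the date keeps its native type, so arithmetic and comparisons need no join, at the cost of one extra nullable column per date that can be missing.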
