数据仓库用户数据 - 设计Q [英] Data warehouse for user data - design Q

查看:124
本文介绍了数据仓库用户数据 - 设计Q的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

如何最好地存储用户数据与日期/时间维度? Usecase是我正在尝试每小时存储用户操作。如股数,喜欢,朋友等我有一个时间表和一个日期表。时间很简单 - 我每天的每个小时都有每行= user_id和colunms = 1到24。但问题是日期。如果我给每一天= 1 colunm,那么我将有一年365科隆。我无法归档数据方式,因为分析也需要过去的数据。其他策略是什么?

How to best store user data vs date/time dimension? Usecase is I am trying to store user actions per day, per hour. Such as number of Shares, likes, friends etc. I have a time table and a date table. For time it is easy - i have each row = user_id and colunms = 1 to 24 for each hour of the day. But problem is for dates. If i give each day = 1 colunm then i will have 365 colunms a year. I cannot archive the data way either because analytic needs past data too. What are the other strategies?

推荐答案

dimDate : 1 row per date
dimTime : 1 row per minute

最初你必须说出 grain 事实表,然后坚持

At the beginning you have to state the "grain" of the fact table and then stick to it.

如果粮食是一天,那么 TimeKey 总是指向23:59的键。

If the grain is one day, then TimeKey always points to the key of "23:59".

如果粮食是一个小时,那么 TimeKey 指向HH:59的条目。

If the grain is one hour, then TimeKey points to entries of "HH:59".

如果粮食是一分钟,那么 TimeKey 指向相应的HH:MM

If the grain is one minute, then TimeKey points to the respective "HH:MM"

如果粮食是15分钟,那么 TimeKey 指向相应的HH :14,HH:29,HH:44,HH:59

If the grain is 15 minutes, then TimeKey points to the respective "HH:14", "HH:29", "HH:44", "HH:59"

等等...

-- How many new friends did specific user gain
-- in first three months of years 2008, 2009 and 2010
-- between hours 3 and 5 in the morning
-- by day of week
-- not counting holidays ?

select
      DayOfWeek
    , sum(NewFriends) as FriendCount
from factUserAction as f
join dbo.dimUser    as u on u.UserKey = f.UserKey
join dbo.dimDate    as d on d.DateKey = f.DateKey
join dbo.dimTime    as t on t.TimeKey = f.TimeKey
where CalendarYear between 2008 and 2010
  and MonthNumberInYear between 1 and 3
  and t.Hour between 3 and 5
  and d.IsHoliday = 'no'
  and UserEmail = 'john_doe@gmail.com' 
group by DayOfWeek
order by DayOfWeek ;

这篇关于数据仓库用户数据 - 设计Q的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆