OLAP cube design issue for telecommunication data


Problem description

Background: I am analyzing call detail record (CDR) data in order to segment customers by call duration, time of call (holiday or non-holiday call, business or non-business call), subscriber age group, and gender. The data comes from two tables: cdr (columns card_number, service_key, calling, called, start_time, clear_time, duration) and subscriber_detail (columns subscriber_name, subscriber_address, DOB, gender). I have designed the OLAP cube as follows.

Call_date holds the date of the call, with year, month, and day. Call_time is the time of day at which the call happened, in seconds.

Question: If we take call_time in seconds, it has 86400 columns for each day (possibly a curse-of-dimensionality problem), so we are thinking of reducing its dimensionality by using 30-second time pulses (telecom operators charge on the basis of pulses, and 30 seconds is the pulse duration in our context). The first question is: is replacing the time with pulse intervals the best way to do this? The second is: a problem may arise if one subscriber makes two or more calls within a single pulse interval. For example, the first call starts at 21:01:00 and ends at 21:01:05, and the second call starts at 21:01:15 and ends at 21:01:20. How can this type of problem be resolved?
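The collision described in the question can be reproduced in a few lines. This is only a sketch of the bucketing arithmetic, not part of the original design: the helper names `seconds_since_midnight` and `pulse_index` are hypothetical, and the two start times are the ones from the example above.

```python
from datetime import time

PULSE_SECONDS = 30               # pulse duration stated in the question
SECONDS_PER_DAY = 24 * 60 * 60   # 86400 one-second members per day


def seconds_since_midnight(t: time) -> int:
    """Convert a time of day to whole seconds since 00:00:00."""
    return t.hour * 3600 + t.minute * 60 + t.second


def pulse_index(t: time, pulse_seconds: int = PULSE_SECONDS) -> int:
    """Hypothetical helper: map a time of day to its 0-based pulse bucket."""
    return seconds_since_midnight(t) // pulse_seconds


# Number of members in the time dimension after bucketing by pulse.
pulses_per_day = SECONDS_PER_DAY // PULSE_SECONDS

# The two calls from the question: both start inside the same 30-second pulse.
first_call = pulse_index(time(21, 1, 0))
second_call = pulse_index(time(21, 1, 15))

print(pulses_per_day, first_call, second_call)  # prints "2880 2522 2522"
```

Bucketing shrinks the time dimension from 86400 to 2880 members per day, but both calls land in bucket 2522, which is exactly the ambiguity the second question asks about.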

Recommended answer

If I were you, I would divide the day into 10-minute slots and use a linked list to store the multiple call durations that fall within a given slot. The time dimension then has 144 members in total, which limits drill-down to a granularity of 10 minutes.
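One possible reading of this suggestion is sketched below. It is an illustration under assumptions, not the answerer's actual implementation: a plain Python `list` stands in for the linked list, the helper name `slot_index` is hypothetical, and the sample calls are the two from the question (each lasting 5 seconds).

```python
from collections import defaultdict
from datetime import time

SLOT_SECONDS = 10 * 60                           # 10-minute slots, as suggested
SLOTS_PER_DAY = 24 * 60 * 60 // SLOT_SECONDS     # 144 members in the time dimension


def slot_index(t: time) -> int:
    """Hypothetical helper: map a time of day to its 0-based 10-minute slot."""
    return (t.hour * 3600 + t.minute * 60 + t.second) // SLOT_SECONDS


# (start time, duration in seconds) for the two calls in the question.
calls = [
    (time(21, 1, 0), 5),
    (time(21, 1, 15), 5),
]

# Each slot keeps every duration that starts in it, so calls that share
# a slot are stored side by side instead of overwriting one another.
durations_by_slot = defaultdict(list)
for start, duration in calls:
    durations_by_slot[slot_index(start)].append(duration)

print(SLOTS_PER_DAY, dict(durations_by_slot))  # prints "144 {126: [5, 5]}"
```

Both calls fall in slot 126 (21:00:00 to 21:09:59) and both durations are retained, so totals or counts per slot can still be computed, although anything finer than 10 minutes is no longer visible.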
