R time series modeling on weekly data using ts() object


Question

I am trying to do time series modeling and forecasting in R using weekly data like the following:

biz week     Amount        Count
2006-12-27   973710.7     816570
2007-01-03  4503493.2    3223259
2007-01-10  2593355.9    1659136
2007-01-17  2897670.9    2127792
2007-01-24  3590427.5    2919482
2007-01-31  3761025.7    2981363
2007-02-07  3550213.1    2773988
2007-02-14  3978005.1    3219907
2007-02-21  4020536.0    3027837
2007-02-28  4038007.9    3191570
2007-03-07  3504142.2    2816720
2007-03-14  3427323.1    2703761
...
2014-02-26  99999999.9   1234567

About my data: as shown above, each week is labeled by its first day (my week starts on Wednesday and ends on Tuesday). When I construct my ts object, I tried

ts <- ts(df, frequency=52, start=c(2007,1))

The problems I ran into are:

1) Some years may have 53 weeks, so frequency=52 will not work for those years;

2) My starting week/date is 2006-12-27. How should I set the start parameter: start=c(2006,52) or start=c(2007,1), given that the week of 2006-12-27 crosses the year boundary? Also, for modeling, is it better to have complete years of data? For example, if I only had a partial year of data for my start year 2007, would it be better to drop 2007 and start with 2008? What about 2014: since it is not a complete year yet, should I use what I have for modeling or not? Either way, I still need to decide how to handle weeks that fall on a year boundary, like 2006-12-27. Should I count it as week 1 of 2007 or as the last week of 2006?

3) When I use ts <- ts(df, frequency=52, start=c(2007,1)) and then print it, I get the output shown below. Instead of 2007.01, 2007.02, ..., 2007.52, I get 2007.000, 2007.019, ..., where the step comes from 1/52 ≈ 0.019. This is mathematically correct but not easy to interpret. Is there a way to label the rows with the date itself, as in a data frame, or at least as 2007 wk1, 2007 wk2, ...?

==========

Time Series:
Start = c(2007, 1) 
End = c(2014, 11) 
Frequency = 52 
          Amount        Count
2007.000   645575.4     493717
2007.019  2185193.2    1659577
2007.038  1016711.8     860777
2007.058  1894056.4    1450101
2007.077  2317517.6    1757219
2007.096  2522955.8    1794512
2007.115  2266107.3    1723002 

4) My goal is to model this weekly data and then decompose it to see the seasonal components. It seems I have to use the ts() function to convert the data to a ts object so that I can use the decompose() function. I tried xts() and got an error stating "time series has no or less than 2 periods". I guess this is because xts() won't let me specify the frequency, right?

xts <- xts(df,order.by=businessWeekDate)

5) I looked for the answer in this forum and other places as well. Most of the examples are monthly, and although there are some weekly time series questions, none of the answers are straightforward. Hopefully somebody can help answer my questions here.

Recommended answer

Using a non-integer frequency works quite well and is compatible with most models (auto.arima, ets, ...). For the start date, I just use the convenience functions in lubridate. The important thing here is to be consistent when working with multiple time series that may have different start and end dates.

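library(lubridate)
# 365.25/7 is about 52.18 weeks per year on average, so no year is forced to have exactly 52 weeks
ts(df$Amount,
   frequency = 365.25/7,
   # decimal_date() converts the first week's date into a fractional year
   start = decimal_date(ymd("2006-12-27")))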
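To make the idea above concrete, here is a minimal base-R sketch. It rests on several assumptions: the data is synthetic (not the asker's figures), the helper to_decimal_year() is a hypothetical stand-in for lubridate's decimal_date() so that the block runs without extra packages, and the names week_start, labelled and weeks_in_year are only illustrative. It builds the series with a non-integer frequency, keeps the real dates next to the values for readable labels (question 3), and counts how many week labels fall in each calendar year (question 1).

```r
# Week-start dates: every Wednesday from 2006-12-27 to 2014-02-26,
# matching the date range shown in the question.
week_start <- seq(as.Date("2006-12-27"), as.Date("2014-02-26"), by = "week")

# Synthetic amounts with a yearly cycle (illustrative data only).
set.seed(1)
n <- length(week_start)
amount <- 3e6 + 5e5 * sin(2 * pi * seq_len(n) / (365.25 / 7)) +
  rnorm(n, sd = 1e5)

# Hypothetical base-R stand-in for lubridate::decimal_date():
# the year plus the fraction of that year already elapsed.
to_decimal_year <- function(d) {
  year <- as.integer(format(d, "%Y"))
  jan1 <- as.Date(paste0(year, "-01-01"))
  next_jan1 <- as.Date(paste0(year + 1L, "-01-01"))
  year + as.numeric(d - jan1) / as.numeric(next_jan1 - jan1)
}

# Average number of weeks per year, about 52.18.
weeks_per_year <- 365.25 / 7

# ts() accepts a single fractional year as 'start'.
y <- ts(amount,
        frequency = weeks_per_year,
        start = to_decimal_year(week_start[1]))

# Keep the actual dates alongside the ts time index for labelling.
labelled <- data.frame(week_start = week_start,
                       decimal_year = as.numeric(time(y)),
                       amount = as.numeric(y))
print(head(labelled, 3))

# Number of week labels (Wednesdays) per calendar year: some years
# have 53, which is why a fixed frequency of 52 drifts over time.
weeks_in_year <- table(format(week_start, "%Y"))
print(weeks_in_year)
```

The decimal time index of a ts object is evenly spaced, so it only approximates the calendar; treating the separate date column as the authoritative label avoids having to decide whether 2006-12-27 belongs to 2006 or 2007.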
