Dynamic time-series prediction and rollapply

Question

I am trying to get a rolling prediction of a dynamic timeseries in R (and then work out squared errors of the forecast). I based a lot of this code on this StackOverflow question, but I am very new to R so I am struggling quite a bit. Any help would be much appreciated.

require(zoo)
require(dynlm)

set.seed(12345)
#create variables
x<-rnorm(mean=3,sd=2,100)
y<-rep(NA,100)
y[1]<-x[1]
for(i in 2:100) y[i]=1+x[i-1]+0.5*y[i-1]+rnorm(1,0,0.5)
int<-1:100
dummydata<-data.frame(int=int,x=x,y=y)

zoodata<-as.zoo(dummydata)

prediction<-function(series)
  {
  mod<-dynlm(formula = y ~ L(y) + L(x), data = series) #get model
   nextOb<-nrow(series)+1
   #make forecast
   predicted<-coef(mod)[1]+coef(mod)[2]*zoodata$y[nextOb-1]+coef(mod)[3]*zoodata$x[nextOb-1]

   #strip timeseries information
   attributes(predicted)<-NULL

   return(predicted)
  }                

rolling<-rollapply(zoodata,width=40,FUN=prediction,by.column=FALSE)

Returns:

20          21      .....      80
10.18676  10.18676          10.18676

Which has two problems I was not expecting:

  1. It runs from 20->80, rather than 40->100 as I would expect (since the width is 40)
  2. The prediction it gives is constant: 10.18676

What am I doing wrong? And is there an easier way to do the prediction than to write it all out? Thanks!

Recommended answer

The main problem with your function is the data argument to dynlm. If you look in ?dynlm you will see that the data argument must be a data.frame or a zoo object. Unfortunately, I just learned that rollapply splits your zoo object into array objects before passing each window to FUN. This means that dynlm, after noting that your data argument was not of the right form, searched for x and y in your global environment, where of course they were defined at the top of your code. That is why every window produced the same prediction. The solution is to convert series into a zoo object. There were a couple of other issues with your code, so I post a corrected version here:

prediction <- function(series) {
   mod <- dynlm(formula = y ~ L(y) + L(x), data = as.zoo(series)) # get model
   # nextOb <- nrow(series)+1 # This will always be the window width + 1 (41 here). I think you mean:
   nextOb <- max(series[,'int'])+1 # To get the first row that follows the window
   if (nextOb<=nrow(zoodata)) {   # You won't predict the last one
     # make forecast
     # predicted<-coef(mod)[1]+coef(mod)[2]*zoodata$y[nextOb-1]+coef(mod)[3]*zoodata$x[nextOb-1]
     # That would work, but there is a very nice function called predict
     predicted <- predict(mod,newdata=data.frame(x=zoodata[nextOb,'x'],y=zoodata[nextOb,'y']))
     # I'm not sure why you used nextOb-1
     attributes(predicted) <- NULL
     # I added the square error as well as the prediction.
     c(predicted=predicted,square.res=(predicted-zoodata[nextOb,'y'])^2)
   }
}

# width=40 as in the question; align='right' indexes each result by the last row of its window
rollapply(zoodata,width=40,FUN=prediction,by.column=FALSE,align='right')
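
The claim that rollapply hands FUN something other than a zoo object can be checked directly. The following is only a minimal sketch on a toy series: the names z and window_type are made up for illustration, and it assumes the default rollapply behaviour of the installed zoo version.

```r
library(zoo)

# Toy two-column zoo series (illustrative data, not the series from the question)
z <- zoo(cbind(x = 1:6, y = (1:6)^2))

# Hypothetical helper: report what kind of object each window arrives as
window_type <- function(w) c(is_zoo = is.zoo(w), is_matrix = is.matrix(w))

res <- rollapply(z, width = 3, FUN = window_type, by.column = FALSE)
print(res)
```

If is_zoo comes back FALSE for every window, that is consistent with the explanation above and is the reason for wrapping series in as.zoo() before passing it to dynlm.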

Your other question, about the numbering of your results, can be controlled by the align argument of rollapply. With width 40 on 100 observations, left would give you 1..61, center (the default) would give you 20..80, and right gets you 40..100.
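
To see how align moves the index of the results, here is a small sketch on a toy series of 100 observations with width 40, matching the question's setup. The helper name idx is made up for illustration; it only returns the first and last index of the rolled result.

```r
library(zoo)

z <- zoo(1:100)  # toy series indexed 1..100

# Hypothetical helper: first and last index of the rolled result for a given alignment
idx <- function(a) {
  rolled <- rollapply(z, width = 40, FUN = function(w) mean(w), align = a)
  range(index(rolled))
}

idx("left")    # 1 61
idx("center")  # 20 80
idx("right")   # 40 100
```

Each setting yields the same 61 windows; only the label attached to each window changes (its first row, its middle row, or its last row).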
