Dynamic time-series prediction and rollapply
Question
I am trying to get a rolling prediction of a dynamic timeseries in R (and then work out squared errors of the forecast). I based a lot of this code on this StackOverflow question, but I am very new to R so I am struggling quite a bit. Any help would be much appreciated.
require(zoo)
require(dynlm)
set.seed(12345)
#create variables
x<-rnorm(mean=3,sd=2,100)
y<-rep(NA,100)
y[1]<-x[1]
for(i in 2:100) y[i]=1+x[i-1]+0.5*y[i-1]+rnorm(1,0,0.5)
int<-1:100
dummydata<-data.frame(int=int,x=x,y=y)
zoodata<-as.zoo(dummydata)
prediction<-function(series)
{
  mod<-dynlm(formula = y ~ L(y) + L(x), data = series) #get model
  nextOb<-nrow(series)+1
  #make forecast
  predicted<-coef(mod)[1]+coef(mod)[2]*zoodata$y[nextOb-1]+coef(mod)[3]*zoodata$x[nextOb-1]
  #strip timeseries information
  attributes(predicted)<-NULL
  return(predicted)
}
rolling<-rollapply(zoodata,width=40,FUN=prediction,by.column=FALSE)
Returns:
20 21 ..... 80
10.18676 10.18676 10.18676
Which has two problems I was not expecting:
- It runs from 20->80, not 40->100 as I would expect (since the width is 40)
- The forecasts it gives are constant: 10.18676
What am I doing wrong? And is there an easier way to do the prediction than to write it all out? Thanks!
Recommended answer
The main problem with your function is the data argument to dynlm. If you look in ?dynlm you will see that the data argument must be a data.frame or a zoo object. Unfortunately, I just learned that rollapply splits your zoo objects into array objects. This means that dynlm, after noting that your data argument was not of the right form, searched for x and y in your global environment, which of course were defined at the top of your code. Every window was therefore fit to the same full-length x and y, which is why the forecast never changed. The solution is to convert series into a zoo object. There were a couple of other issues with your code, so I post a corrected version here:
prediction <- function(series) {
  mod <- dynlm(formula = y ~ L(y) + L(x), data = as.zoo(series)) # get model
  # nextOb <- nrow(series)+1 # This will always be 21. I think you mean:
  nextOb <- max(series[,'int'])+1 # To get the first row that follows the window
  if (nextOb<=nrow(zoodata)) { # You won't predict the last one
    # make forecast
    # predicted<-coef(mod)[1]+coef(mod)[2]*zoodata$y[nextOb-1]+coef(mod)[3]*zoodata$x[nextOb-1]
    # That would work, but there is a very nice function called predict
    predicted <- predict(mod,newdata=data.frame(x=zoodata[nextOb,'x'],y=zoodata[nextOb,'y']))
    # I'm not sure why you used nextOb-1
    attributes(predicted) <- NULL
    # I added the square error as well as the prediction.
    c(predicted=predicted,square.res=(predicted-zoodata[nextOb,'y'])^2)
  }
}
rollapply(zoodata,width=20,FUN=prediction,by.column=FALSE,align='right')
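On the question of whether there is an easier way to write the forecast: one possible alternative, sketched here under assumptions, is to build the lagged columns by hand and use base R's lm and predict instead of dynlm. This is not the method used in the answer above. The names lagged, y_lag, x_lag, time, width and forecasts are hypothetical, and the window width of 40 is taken from the question.

```r
# Rolling one-step-ahead forecast with base R only (no zoo or dynlm needed).
set.seed(12345)
n <- 100
x <- rnorm(n, mean = 3, sd = 2)
y <- numeric(n)
y[1] <- x[1]
for (i in 2:n) y[i] <- 1 + x[i - 1] + 0.5 * y[i - 1] + rnorm(1, 0, 0.5)

# Row k holds y at time k + 1 together with y and x at time k (the lags).
lagged <- data.frame(time = 2:n, y = y[-1], y_lag = y[-n], x_lag = x[-n])

width <- 40  # rows used to fit each model (assumption: same as the question)

# Each window covers rows s..(s + width - 1); the forecast target is row s + width.
starts <- 1:(nrow(lagged) - width)
forecasts <- t(sapply(starts, function(s) {
  train  <- lagged[s:(s + width - 1), ]
  target <- lagged[s + width, ]
  fit    <- lm(y ~ y_lag + x_lag, data = train)
  pred   <- unname(predict(fit, newdata = target))
  c(time = target$time, predicted = pred, square.res = (pred - target$y)^2)
}))

head(forecasts)
```

Because the lags are explicit columns, the row passed as newdata already contains the previous period's y and x, so there is no ambiguity about which observation feeds the forecast.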
Your second question, about the numbering of your results, can be controlled by the align argument of rollapply. With width = 40, left would give you 1..61, center (the default) would give you 20..80, and right gets you 40..100.
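The effect of align on the result's index can be checked on a toy series. The sketch below assumes a plain zoo series indexed 1..100 and a width of 40, as in the question; z and idx_range are hypothetical names. It prints the first and last index of the rolling result for each alignment.

```r
library(zoo)

z <- zoo(1:100)  # toy series with index 1..100
width <- 40

# For each alignment, record the first and last index of the rolling result.
idx_range <- sapply(c("left", "center", "right"), function(a) {
  r <- rollapply(z, width = width, FUN = mean, align = a)
  range(index(r))
})
idx_range
```

For a one-step-ahead forecast, align = 'right' is the natural choice, since each result is then labelled with the last observation inside its window.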