Explaining the forecasts from an ARIMA model

Question

I am trying to explain to myself the forecasting result from applying an ARIMA model to a time-series dataset. The data is from the M1-Competition, the series is MNB65. I am trying to fit the data to an ARIMA(1,0,0) model and get the forecasts. I am using R. Here are some output snippets:

> arima(x, order = c(1,0,0))
Series: x 
ARIMA(1,0,0) with non-zero mean 
Call: arima(x = x, order = c(1, 0, 0)) 
Coefficients:
         ar1  intercept
      0.9421  12260.298
s.e.  0.0474    202.717

> predict(arima(x, order = c(1,0,0)), n.ahead=12)
$pred
Time Series:
Start = 53 
End = 64 
Frequency = 1 
[1] 11757.39 11786.50 11813.92 11839.75 11864.09 11887.02 11908.62 11928.97 11948.15 11966.21 11983.23 11999.27

I have a few questions:

(1) How do I explain that although the dataset shows a clear downward trend, the forecast from this model trends upward? This also happens for ARIMA(2,0,0), which is the best ARIMA fit for the data using auto.arima (forecast package) and for an ARIMA(1,0,1) model.

(2) The intercept value for the ARIMA(1,0,0) model is 12260.298. Shouldn't the intercept satisfy the equation C = mean * (1 - sum(AR coeffs))? In that case, the value should be 715.52. I must be missing something basic here.

(3) This is clearly a series with non-stationary mean. Why is an AR(2) model still selected as the best model by auto.arima? Could there be an intuitive explanation?

Thanks.

Recommended answer

  1. No ARIMA(p,0,q) model will allow for a trend because the model is stationary. If you really want to include a trend, use ARIMA(p,1,q) with a drift term, or ARIMA(p,2,q). The fact that auto.arima() is suggesting 0 differences would usually indicate there is no clear trend.
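
To see why the forecasts rise, it may help to recall that forecasts from a stationary AR(1) model decay geometrically toward the estimated mean. The sketch below uses only the rounded coefficients and the first forecast printed in the question, so it should match the reported forecasts only up to rounding. The variable names are mine.

```r
# Coefficients as printed by arima() in the question (rounded).
phi <- 0.9421      # ar1
mu  <- 12260.298   # "intercept", which arima() reports as the series mean

# First forecast printed by predict() in the question.
f1 <- 11757.39

# AR(1) forecasts close a fraction (1 - phi) of the remaining gap to the
# mean at every step: f_h = mu + phi^(h - 1) * (f1 - mu).
h  <- 1:12
fc <- mu + phi^(h - 1) * (f1 - mu)

# Forecasts printed in the question, for comparison.
reported <- c(11757.39, 11786.50, 11813.92, 11839.75, 11864.09, 11887.02,
              11908.62, 11928.97, 11948.15, 11966.21, 11983.23, 11999.27)

print(round(fc, 2))
print(max(abs(fc - reported)))  # small; the gap comes from rounded inputs
```

The forecasts start below the estimated mean of about 12260, so each one moves up toward it, whatever the recent direction of the data.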

  2. The help file for arima() shows that the intercept is actually the mean. That is, the AR(1) model is (Y_t - c) = ϕ(Y_{t-1} - c) + e_t rather than Y_t = c + ϕY_{t-1} + e_t as you might expect.
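
As a rough check of this parameterization, the sketch below fits an AR(1) with base R's arima() to simulated data (not the MNB65 series; the true mean of 100 and coefficient of 0.5 are arbitrary choices of mine). It compares the reported "intercept" with the sample mean, then converts it to the constant of the other form via c = mean * (1 - ϕ).

```r
set.seed(1)

# Simulated AR(1) around a known mean (illustrative values only).
mu  <- 100
phi <- 0.5
x   <- mu + arima.sim(model = list(ar = phi), n = 2000)

fit <- arima(x, order = c(1, 0, 0))
est <- coef(fit)

# The reported "intercept" should sit close to the sample mean ...
print(c(intercept = est[["intercept"]], sample_mean = mean(x)))

# ... while the constant in Y_t = c + phi * Y_{t-1} + e_t is mean * (1 - phi).
implied_c <- est[["intercept"]] * (1 - est[["ar1"]])
print(implied_c)

# Same conversion applied to the rounded estimates printed in the question.
c_question <- 12260.298 * (1 - 0.9421)
print(c_question)
```

With the question's rounded estimates, the implied constant comes out at roughly 709.87.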

  3. auto.arima() uses a unit root test to determine the number of differences required. So check the results from the unit root test to see what's going on. You can always specify the required number of differences in auto.arima() if you think the unit root tests are not leading to a sensible model.
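
A minimal sketch of both options, assuming the forecast package is installed. The series is a simulated stand-in for a steadily declining series, not the MNB65 data, and d = 1 is only an illustrative choice.

```r
library(forecast)  # assumed to be installed

# Simulated declining series (a random walk with negative drift).
set.seed(42)
x <- ts(12000 + cumsum(rnorm(60, mean = -20, sd = 15)))

# Number of differences suggested by each unit root test.
d_kpss <- ndiffs(x, test = "kpss")
d_adf  <- ndiffs(x, test = "adf")
print(c(kpss = d_kpss, adf = d_adf))

# Override the test and force one difference in the model search.
fit <- auto.arima(x, d = 1)
print(arimaorder(fit))
```

If the two tests disagree, as they do for the data in the question, fixing d yourself keeps the model search from depending on which test is used.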

Here are the results from two tests for your data:

R> adf.test(x)

        Augmented Dickey-Fuller Test

data:  x 
Dickey-Fuller = -1.031, Lag order = 3, p-value = 0.9249
alternative hypothesis: stationary 

R> kpss.test(x)

        KPSS Test for Level Stationarity

data:  x 
KPSS Level = 0.3491, Truncation lag parameter = 1, p-value = 0.09909

So the ADF test does not come close to rejecting its null hypothesis of non-stationarity (p = 0.92), while the KPSS test doesn't quite reject its null hypothesis of stationarity at the 5% level (p = 0.099). auto.arima() uses the KPSS test by default. You could use auto.arima(x, test="adf") if you wanted the ADF test instead. In that case, it suggests an ARIMA(0,2,1) model, which does have a trend.
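
For the drift alternative mentioned in point 1, base R's arima() has no drift argument. One common workaround, sketched here, is to pass a time index through xreg; with d = 1 its coefficient acts as the drift. The data are simulated (not MNB65), and ARIMA(0,1,0) is simply the p = q = 0 case chosen to keep the example short.

```r
# Simulated stand-in for a steadily declining series (NOT the MNB65 data).
set.seed(123)
x <- ts(12000 + cumsum(rnorm(60, mean = -20, sd = 15)))

# Time index as a regressor; after one difference it becomes a constant,
# so its coefficient is the average change per period (the drift).
drift <- cbind(drift = seq_along(x))
fit <- arima(x, order = c(0, 1, 0), xreg = drift)

# Future values of the time index for the 12 forecast steps.
n <- length(x)
new_drift <- cbind(drift = (n + 1):(n + 12))
fc <- predict(fit, n.ahead = 12, newxreg = new_drift)$pred

print(coef(fit))  # estimated drift per period
print(fc)         # forecasts continue along the estimated drift
```

Unlike the stationary ARIMA(1,0,0) fit, these forecasts keep moving in the direction of the estimated drift instead of reverting to a mean.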
