ARIMA forecast gives different results with the new Python statsmodels
Question
I'm doing out-of-sample forecasting with ARIMA(0,1,0).
In the latest stable version of Python's statsmodels, 0.12, I calculate:
import statsmodels.tsa.arima_model as stats
time_series = [2, 3.0, 5, 7, 9, 11, 13, 17, 19]
steps = 4
alpha = 0.05
model = stats.ARIMA(time_series, order=(0, 1, 0))
model_fit = model.fit(disp=0)
forecast, _, intervals = model_fit.forecast(steps=steps, exog=None, alpha=alpha)
This gives:
forecast = [21.125, 23.25, 25.375, 27.5]
intervals = [[19.5950036, 22.6549964 ], [21.08625835, 25.41374165], [22.72496851, 28.02503149], [24.44000721, 30.55999279]]
and a FutureWarning, which states:
FutureWarning:
statsmodels.tsa.arima_model.ARMA and statsmodels.tsa.arima_model.ARIMA have
been deprecated in favor of statsmodels.tsa.arima.model.ARIMA (note the .
between arima and model) and
statsmodels.tsa.SARIMAX. These will be removed after the 0.12 release.
With the new interface, as hinted at in the FutureWarning, I calculate:
import statsmodels.tsa.arima.model as stats
time_series = [2, 3.0, 5, 7, 9, 11, 13, 17, 19]
steps = 4
alpha = 0.05
model = stats.ARIMA(time_series, order=(0, 1, 0))
model_fit = model.fit()
forecast = model_fit.get_forecast(steps=steps)
forecasts_and_intervals = forecast.summary_frame(alpha=alpha)
which gives different results:
forecasts_and_intervals =
y  mean   mean_se  mean_ci_lower  mean_ci_upper
0  19.0  2.263842      14.562951      23.437049
1  19.0  3.201556      12.725066      25.274934
2  19.0  3.921089      11.314806      26.685194
3  19.0  4.527684      10.125903      27.874097
I would like to obtain the same results as before. Am I using the new interface correctly?
I need both the forecast and the intervals. I have already tried the other functions the new interface offers, such as forecast.
In particular, I'm wondering why the forecast is 19 for every step.
Thanks a lot for your help.
Here is the documentation for statsmodels 0.12.2: https://www.statsmodels.org/stable/generated/statsmodels.tsa.arima_model.ARIMA.html?highlight=arima#statsmodels.tsa.arima_model.ARIMA
Here is the documentation for the newer version of ARIMA: https://www.statsmodels.org/stable/generated/statsmodels.tsa.arima.model.ARIMA.html?highlight=arima#statsmodels.tsa.arima.model.ARIMA
Recommended answer
The difference comes down to whether the model includes a constant term. The older statsmodels.tsa.arima_model.ARIMA always includes one, with no option to turn it off. When differencing is used, the constant is still included, but in the differenced domain (otherwise the differencing would eliminate it anyway). So its ARIMA(0, 1, 0) model is:
y_t - y_{t-1} = c + e_t
which is a "random walk with drift".
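As a rough check, the drift model can be worked out by hand: the estimate of c is the mean of the first differences, and the h-step forecast is the last value plus h times c. The sketch below uses only the standard library. The name drift_forecast is made up for illustration, and the interval formula (standard error sigma * sqrt(h), ignoring the uncertainty in c) is an assumption, although it does appear to reproduce the numbers in the question.

```python
from math import sqrt
from statistics import NormalDist, fmean


def drift_forecast(series, steps, alpha=0.05):
    """Hand-computed forecasts for y_t - y_{t-1} = c + e_t (hypothetical helper)."""
    # First differences of the series.
    diffs = [b - a for a, b in zip(series, series[1:])]
    # Drift estimate: the mean of the differences.
    c = fmean(diffs)
    # Residual standard deviation (divides by n, i.e. the ML-style estimate).
    sigma = sqrt(fmean([(d - c) ** 2 for d in diffs]))
    # Two-sided normal quantile for the requested coverage.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    rows = []
    for h in range(1, steps + 1):
        mean = series[-1] + h * c
        half_width = z * sigma * sqrt(h)  # assumed: forecast variance grows as h * sigma^2
        rows.append((mean, mean - half_width, mean + half_width))
    return rows


time_series = [2, 3.0, 5, 7, 9, 11, 13, 17, 19]
for mean, lower, upper in drift_forecast(time_series, steps=4):
    print(f"{mean:.3f}  [{lower:.4f}, {upper:.4f}]")
```

For this series the differences sum to 17 over 8 steps, so c = 2.125 and the point forecasts are 21.125, 23.25, 25.375 and 27.5, the same values the old class returned.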
For the new statsmodels.tsa.arima.model.ARIMA, as the documentation you linked says, no trend term of any kind (including the constant c) is included by default when differencing is involved, which is your case. So its ARIMA(0, 1, 0) model is:
y_t - y_{t-1} = e_t
which is a plain "random walk". Its forecasts are the naive forecasts, i.e. they repeat the last observed value (19 in your case).
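The driftless case can be checked by hand in the same spirit. This is a minimal sketch, assuming the forecast standard error is sigma * sqrt(h) with sigma estimated from the raw differences (no mean subtracted, since the model has no constant). The helper name random_walk_forecast is hypothetical, and the fitted model's numbers may differ in the last few decimals because statsmodels estimates sigma numerically.

```python
from math import sqrt
from statistics import NormalDist, fmean


def random_walk_forecast(series, steps, alpha=0.05):
    """Hand-computed forecasts for y_t - y_{t-1} = e_t (hypothetical helper)."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    # No constant in the model, so the differences themselves are the residuals.
    sigma = sqrt(fmean([d * d for d in diffs]))
    z = NormalDist().inv_cdf(1 - alpha / 2)
    last = series[-1]
    rows = []
    for h in range(1, steps + 1):
        se = sigma * sqrt(h)  # assumed: forecast variance grows as h * sigma^2
        # The point forecast never moves away from the last observation.
        rows.append((last, se, last - z * se, last + z * se))
    return rows


time_series = [2, 3.0, 5, 7, 9, 11, 13, 17, 19]
for mean, se, lower, upper in random_walk_forecast(time_series, steps=4):
    print(f"{mean:.1f}  se={se:.4f}  [{lower:.4f}, {upper:.4f}]")
```

Every point forecast is 19, and the standard errors come out close to the mean_se column in the question (about 2.2638 for the first step), which suggests the new interface was being used correctly and simply fitted a model without drift.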
So what can you do to make the new class behave like the old one?
It has a parameter called trend that you can set to get the same behaviour. Since you are differencing (d=1), passing trend="t" should give the same model as the old one ("t" means a linear trend, but since d = 1 it reduces to a constant in the differenced domain):
import statsmodels.tsa.arima.model as stats
time_series = [2, 3.0, 5, 7, 9, 11, 13, 17, 19]
steps = 4
alpha = 0.05
model = stats.ARIMA(time_series, order=(0, 1, 0), trend="t") # only change is here!
model_fit = model.fit()
forecast = model_fit.get_forecast(steps=steps)
forecasts_and_intervals = forecast.summary_frame(alpha=alpha)
Here is the forecasts_and_intervals I get:
y       mean   mean_se  mean_ci_lower  mean_ci_upper
0  21.124995  0.780622      19.595004      22.654986
1  23.249990  1.103966      21.086256      25.413724
2  25.374985  1.352077      22.724962      28.025008
3  27.499980  1.561244      24.439997      30.559963
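To see why a linear trend collapses to a constant once the series is differenced, a short derivation may help. It assumes the new class treats the trend as a deterministic term c t added to an integrated error u_t (a regression with ARIMA errors), which is how I read its documentation; the symbol u_t is just notation for this sketch.

```latex
\begin{aligned}
y_t &= c\,t + u_t, \qquad u_t - u_{t-1} = e_t \\
y_t - y_{t-1} &= c\,t - c\,(t-1) + (u_t - u_{t-1}) = c + e_t
\end{aligned}
```

The last line is the random walk with drift from the old class, so the two fits should agree up to small numerical differences in the estimated parameters.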