Correct way to use ARMAResult.predict() function
Question
Following up on my earlier question, "How to get constant term in AR Model with statsmodels and Python?", I'm now trying to fit the data with an ARMA model, but again I can't find a way to interpret the model's result. Here is what I have done, based on "ARMA out-of-sample prediction with statsmodels" and the ARMAResults.predict API documentation.
import matplotlib.pyplot as plt

# Parameters
INPUT_DATA_POINT = 200
P = 5
Q = 0
# Read data (the field at index 5 of each CSV row)
data = []
with open('stock_all.csv', 'r') as f:
    for line in f:
        data.append(float(line.split(',')[5]))
# Fit ARMA model using the first piece of data
# (arma_model is a helper that is not shown in this post)
result = arma_model(data[:INPUT_DATA_POINT], P, Q)
# Predict using model (fit dimension is len(data) + 1 why?)
fit = result.predict(0, len(data))
# Plot
plt.figure(facecolor='white')
plt.title('ARMA Model Fitted Using ' + str(INPUT_DATA_POINT) + ' Data Points, P=' + str(P) + ' Q=' + str(Q) + '\n')
plt.plot(data, 'b-', label='data')
plt.plot(range(INPUT_DATA_POINT), result.fittedvalues, 'g--', label='fit')
plt.plot(range(len(data)), fit[:len(data)], 'r-', label='predict')
plt.legend(loc=4)
plt.show()
The result is very strange, because it should be nearly identical to the result from my last question, linked above. I also don't quite understand why there are results for the first few data points, since those shouldn't be valid (there are no previous values to compute from).
I tried writing my own prediction code, shown below (the top part, which is identical to the code above, is omitted).
# Predict using model
start_pos = max(result.k_ar, result.k_ma)
fit = []
for t in range(start_pos, len(data)):
    value = 0
    for i in range(1, result.k_ar + 1):
        value += result.arparams[i - 1] * data[t - i]
    for i in range(1, result.k_ma + 1):
        value += result.maparams[i - 1] * data[t - i]
    fit.append(value)
# Plot
plt.figure(facecolor='white')
plt.title('ARMA Model Fitted Using ' + str(INPUT_DATA_POINT) + ' Data Points, P=' + str(P) + ' Q=' + str(Q) + '\n')
plt.plot(data, 'b-', label='data')
plt.plot(range(INPUT_DATA_POINT), result.fittedvalues, 'r+', label='fit')
plt.plot(range(start_pos, len(data)), fit, 'r-', label='predict')
plt.legend(loc=4)
plt.show()
This is the best result I got.
Recommended answer
You trained the model on a subset of the data and then predicted out of sample. AR(MA) forecasts converge quickly to the mean of the data, which is why you see the first result. In your second result, you're not doing out-of-sample forecasting; you're just getting out-of-sample fitted values, where each value is a one-step-ahead prediction computed from the actual observed data rather than from earlier forecasts.
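To see the difference between the two, here is a rough sketch in plain Python. It does not use statsmodels, and the AR(2) mean and coefficients are made-up values chosen only for illustration, not anything fitted to the stock data in the question. It compares one-step-ahead predictions that use observed data against a multi-step forecast that feeds its own output back in.

```python
import random

# Hypothetical AR(2) parameters, chosen only for illustration (not fitted to any data).
MU = 10.0          # process mean
PHI = [0.6, 0.3]   # AR coefficients, picked so that the process is stationary


def simulate(n, seed=0):
    """Simulate n points of y_t = MU + sum(PHI[i] * (y_{t-1-i} - MU)) + noise."""
    rng = random.Random(seed)
    y = [MU] * len(PHI)  # start-up values, dropped before returning
    for _ in range(n):
        dev = sum(p * (y[-1 - i] - MU) for i, p in enumerate(PHI))
        y.append(MU + dev + rng.gauss(0.0, 1.0))
    return y[len(PHI):]


def one_step_predictions(y, start):
    """Predict y[t] from the observed y[t-1], y[t-2], ... for every t >= start."""
    return [MU + sum(p * (y[t - 1 - i] - MU) for i, p in enumerate(PHI))
            for t in range(start, len(y))]


def dynamic_forecast(history, steps):
    """Forecast `steps` points ahead, feeding each forecast back in as input."""
    window = list(history)
    out = []
    for _ in range(steps):
        nxt = MU + sum(p * (window[-1 - i] - MU) for i, p in enumerate(PHI))
        out.append(nxt)
        window.append(nxt)
    return out


data = simulate(300)
train = data[:200]

fitted_like = one_step_predictions(data, 200)  # uses real data at every step
forecast = dynamic_forecast(train, 100)        # uses only its own earlier forecasts

print("spread of one-step predictions:", max(fitted_like) - min(fitted_like))
print("gap between last forecast and mean:", abs(forecast[-1] - MU))
```

The one-step predictions keep following the data because they see each new observation, which is what the hand-written loop in the question does. The multi-step forecast only sees its own earlier output, so the deviation from the mean shrinks at every step.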
The first few observations are fit using the Kalman filter recursions (this is the distinction between full maximum likelihood estimates and conditional maximum likelihood estimates).
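To make that distinction concrete, it may help to write it out for an AR(1) model. This is a simplified case assumed here for illustration (the question uses an AR(5)), with Gaussian errors and the symbols c, \phi and \sigma^2 introduced only for this sketch.

```latex
y_t = c + \phi\, y_{t-1} + \varepsilon_t, \qquad \varepsilon_t \sim N(0, \sigma^2), \quad |\phi| < 1

% Full (exact) likelihood: the first observation is modeled by the stationary distribution
\log L_{\text{full}} = \log f(y_1) + \sum_{t=2}^{T} \log f(y_t \mid y_{t-1}),
\qquad y_1 \sim N\!\left(\frac{c}{1-\phi},\ \frac{\sigma^2}{1-\phi^2}\right)

% Conditional likelihood: y_1 is treated as given, so there is no term for t = 1
\log L_{\text{cond}} = \sum_{t=2}^{T} \log f(y_t \mid y_{t-1})
```

In the full form, the model has something to say about the first observation too: its prediction is the unconditional mean c / (1 - \phi). That is, as far as I understand it, why the fitted values cover the first few data points even though there are no earlier values to regress on.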
I would pick up a good forecasting textbook and review it to understand this behavior.