sktime ARIMA invalid frequency


Problem description

I am trying to fit an ARIMA model from the sktime package. I import a dataset and convert it to a pandas Series, then fit the model on the training sample. The error occurs when I try to predict.

from sktime.forecasting.base import ForecastingHorizon
from sktime.forecasting.model_selection import temporal_train_test_split
from sktime.forecasting.arima import ARIMA
import numpy as np, pandas as pd

df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/a10.csv',
                 parse_dates=['date']).set_index('date').T.iloc[0]
p, d, q = 3, 1, 2
y_train, y_test = temporal_train_test_split(df, test_size=24)
model = ARIMA((p, d, q))
results = model.fit(y_train)
fh = ForecastingHorizon(y_test.index, is_relative=False,)

# the error is here !!
y_pred_vals, y_pred_int = results.predict(fh, return_pred_int=True)

The error message is as follows:

ValueError: Invalid frequency. Please select a frequency that can be converted to a regular
`pd.PeriodIndex`. For other frequencies, basic arithmetic operation to compute durations
currently do not work reliably.

I tried using .asfreq("M") while reading the dataset, but then all the values in the series become NaN.
Interestingly, this code works with the default load_airline dataset from sktime.datasets, but not with my dataset from GitHub.

Recommended answer

I get a different error: ValueError: ``unit`` missing, possibly due to a version difference. In any case, I'd say it is better to have the index as a pd.PeriodIndex instead of a pd.DatetimeIndex. The former is, I think, more explicit (a monthly series has periods as its time steps, not exact dates) and works more smoothly. So, after reading the CSV,

df.index = pd.PeriodIndex(df.index, freq="M")

should clear the error (it does in my version, 0.5.1).
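As a rough illustration of why .asfreq("M") produced NaN while the PeriodIndex conversion keeps the values, here is a minimal pandas-only sketch. It uses a small synthetic series rather than the real dataset, and it assumes the CSV's dates fall on the first day of each month; the variable names are made up for the example. With a DatetimeIndex, a month-end frequency reindexes the series onto month-end dates, none of which exist in the data, so every value comes back missing. Converting to periods only relabels each timestamp with its month.

```python
import pandas as pd

# Synthetic stand-in for the monthly series (assumption: dates are
# month-start timestamps such as 1991-07-01, 1991-08-01, ...).
idx = pd.date_range("1991-07-01", periods=6, freq="MS")
y = pd.Series([3.5, 3.2, 3.3, 3.6, 3.6, 4.3], index=idx)

# Reindexing to month-end dates: none of them are present in the
# original index, so all resulting values are NaN.
y_month_end = y.asfreq(pd.offsets.MonthEnd())

# Relabelling each timestamp as a monthly period keeps every value.
y_period = y.copy()
y_period.index = pd.PeriodIndex(y.index, freq="M")

print(y_month_end)
print(y_period)
```

The same relabelling can also be written as y.to_period("M"); either way the series ends up with a regular monthly pd.PeriodIndex, which is what the error message asks for.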
