Time Series Analysis in R: Frequency value in ts() function vs lag in acf plot


Question

I am new to time series analysis. I have monthly sales data for 60 months, from January 2009 to December 2013, and I am trying to forecast sales for the upcoming 6 months with an ARIMA model. I read the data and convert it into a time series object as follows:

 data <- read.csv(file="monthlySalesData.csv", header=TRUE)
 dataInTimeSeries <- ts(data, frequency = 12, start=c(2009,1), end=c(2013,12))

When I draw the acf() plot to find the lag after which the autocorrelation dies down to zero, the lag scale on the X-axis is in decimals. I do not have enough privileges to post an image, but the lag values on the X-axis are decimal, with a maximum lag of about 1.5. The acf values with plot=FALSE also look strange (the output does not seem to show the lags for which the autocorrelation was calculated). I cannot interpret this, and I cannot find the number of lags after which the autocorrelation dies down to zero.

 acf(dataInTimeSeries, plot=FALSE)

Autocorrelations of series ‘dataInTimeSeries’, by lag

0.0000 0.0833 0.1667 0.2500 0.3333 0.4167 0.5000 0.5833 0.6667 0.7500 0.8333 
 1.000  0.642  0.588  0.490  0.401  0.320  0.311  0.269  0.178  0.198  0.229 
0.9167 1.0000 1.0833 1.1667 1.2500 1.3333 1.4167 
 0.271  0.358  0.240  0.210  0.092  0.135  0.098 

What is the issue: is there a problem with my R settings, the data import, or the ts() function? And if this is how acf plots are displayed for monthly data, how should I interpret them?

Thanks in advance!!

Recommended answer

The decimals you see are lags expressed in years, e.g. 0.0833 = 1/12 year, 0.1667 = 2/12 year, and so on.

To get the ACF plot with lags in months, you can try something like:

## Lacking reproducible example, I use simulated monthly data 
tt <- ts(arima.sim(list(order=c(1,0,0), ar=0.4),60), start=2001, deltat=1/12)
## Calculate, but not plot, acf
acfpl <- acf(tt, plot=FALSE)
## Transform the lags from years to months
acfpl$lag <- acfpl$lag * 12

## Plot the acf 
plot(acfpl, xlab="Lag (months)")
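
As an alternative to rescaling the lags, one could drop the time series attributes before calling acf(), so that lags are counted in observations (months here). The following is only a sketch on a simulated series that stands in for the sales data; the object names and the seed are assumptions made for illustration.

```r
set.seed(1)
# Simulated monthly series standing in for the real sales data (an assumption)
tt <- ts(arima.sim(list(order = c(1, 0, 0), ar = 0.4), n = 60),
         start = c(2009, 1), frequency = 12)

# On a ts object with frequency = 12, lags are reported in years: k / 12
acf_years <- acf(tt, plot = FALSE)
head(drop(acf_years$lag))

# On a plain numeric vector, lags are reported in observations: 0, 1, 2, ...
acf_months <- acf(as.numeric(tt), lag.max = 24, plot = FALSE)
head(drop(acf_months$lag))

# The autocorrelation values themselves do not change, only the lag labels
plot(acf_months, xlab = "Lag (months)", main = "ACF of monthly series")
```

The trade-off is that the plain vector no longer carries calendar information, so this is best used only for inspecting the ACF, not for the model fit itself.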

As I understand it, the problem you are dealing with is identifying the orders of an ARMA model. To do that you need both the ACF and the PACF plots. When you say "dying down to zero", you should not expect the values to be exactly zero after some lag. Values inside the 95% confidence band (dashed blue lines) are not significantly different from zero (see also the notes in ?plot.acf).
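
To make "inside the confidence band" concrete, the band drawn by plot.acf under its default white-noise setting is approximately ±qnorm(0.975)/sqrt(n). A rough sketch on simulated data (the series, seed, and variable names are assumptions, not the asker's data):

```r
set.seed(1234)
# Simulated AR(1) series used purely for illustration
x <- arima.sim(list(order = c(1, 0, 0), ar = 0.9), n = 500)
a <- acf(x, plot = FALSE)

# Approximate 95% bound under the white-noise assumption
ci <- qnorm(0.975) / sqrt(a$n.used)

# Drop lag 0, which is always 1
lags <- drop(a$lag)[-1]
rho  <- drop(a$acf)[-1]

# Lags whose autocorrelation falls outside the band
signif_lags <- lags[abs(rho) > ci]
# First lag at which the autocorrelation falls inside the band (NA if none)
first_inside <- lags[which(abs(rho) <= ci)[1]]

signif_lags
first_inside
```

With about 20 lags plotted, one value just outside the band can be expected by chance alone, so isolated small exceedances should not be over-interpreted.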

Identifying the order of an ARIMA model can be tricky, but there are some rules you can follow. For example, an AR(p) process has an ACF that looks like a damped exponential or sine function, and a PACF with p significant lags. For an MA(q) process it is the other way round: the ACF has q significant lags and the PACF decays.
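
These rules can also be checked against the theoretical ACF and PACF, without any simulation noise, using ARMAacf() from the stats package. The coefficient 0.9 below is just an illustrative choice:

```r
# Theoretical ACF (lags 0..5) and PACF (lags 1..5) of an AR(1) with phi = 0.9
ar_acf  <- ARMAacf(ar = 0.9, lag.max = 5)
ar_pacf <- ARMAacf(ar = 0.9, lag.max = 5, pacf = TRUE)

# Theoretical ACF and PACF of an MA(1) with theta = 0.9
ma_acf  <- ARMAacf(ma = 0.9, lag.max = 5)
ma_pacf <- ARMAacf(ma = 0.9, lag.max = 5, pacf = TRUE)

round(ar_acf, 3)   # geometric decay: 0.9^k
round(ar_pacf, 3)  # a single non-zero value at lag 1
round(ma_acf, 3)   # a single non-zero value at lag 1 (after lag 0)
round(ma_pacf, 3)  # decaying, with alternating sign
```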

Just to show what this looks like for these two simple cases, I use arima.sim to simulate two time series, ARIMA(1,0,0) and ARIMA(0,0,1).

set.seed(1234)
arima100 <- arima.sim(list(order=c(1,0,0), ar=0.9), n=500)
arima001 <- arima.sim(list(order=c(0,0,1), ma=0.9), n=500)

par(mfcol=c(2,2))  # 2x2 grid, filled column by column
acf(arima100); acf(arima001)
pacf(arima100); pacf(arima001)

This produces four plots, which show the following:

ARIMA(1,0,0): the ACF decays towards zero, and the PACF has one significant lag. ARIMA(0,0,1): the ACF has one significant lag (after lag 0, which is always 1), and the PACF looks like a damped sine function.

Now, just by looking at your ACF, I would dare to say two things:

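  • Your process probably has an AR term (you must also check the PACF)
  • Your data may have seasonality, because of the peak at lag 12, i.e. one year (you can check this by looking at a plot of the data)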
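
One possible way to check for yearly seasonality is sketched below on a made-up monthly series with a built-in 12-month cycle; the series, seed, and names are assumptions chosen only to illustrate the calls.

```r
set.seed(2)
n <- 60
# Hypothetical monthly series: yearly sine cycle plus noise
y <- ts(10 * sin(2 * pi * seq_len(n) / 12) + rnorm(n),
        frequency = 12, start = c(2009, 1))

plot(y)        # a repeating yearly pattern should be visible
monthplot(y)   # one sub-series per calendar month

# ACF up to two years; lags are in years, so lag 1 means 12 months
a <- acf(y, lag.max = 24, plot = FALSE)
r12 <- drop(a$acf)[which.min(abs(drop(a$lag) - 1))]

# Compare the lag-12 autocorrelation with the approximate 95% bound
bound <- qnorm(0.975) / sqrt(a$n.used)
c(r12 = r12, bound = bound)
```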

Some steps you can follow:

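  • If there is a clear trend in your data, difference it
  • If you have yearly seasonality, take differences at lag 12
  • Plot the ACF and PACF of both the undifferenced and the differenced data
  • Fit a model with arima and check the residuals
  • If you have several candidate models, compare their AIC or BIC values.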
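
The steps above could be sketched roughly as follows. The series is simulated as a stand-in for the sales data, and the two candidate model orders are arbitrary choices for illustration, not a recommendation for the real data.

```r
set.seed(42)
n <- 60
# Hypothetical stand-in for the sales series: trend + yearly cycle + AR(1) noise
sales <- ts(100 + 0.5 * seq_len(n) +
              10 * sin(2 * pi * seq_len(n) / 12) +
              arima.sim(list(ar = 0.5), n = n),
            frequency = 12, start = c(2009, 1))

d1  <- diff(sales)           # first difference to remove the trend
d12 <- diff(d1, lag = 12)    # lag-12 difference to remove yearly seasonality

# ACF and PACF of the undifferenced and the differenced data
op <- par(mfcol = c(2, 2))
acf(sales); pacf(sales); acf(d12); pacf(d12)
par(op)

# Two candidate models (orders chosen only for illustration)
seas <- list(order = c(0, 1, 0), period = 12)
fit1 <- arima(sales, order = c(1, 1, 0), seasonal = seas, method = "ML")
fit2 <- arima(sales, order = c(0, 1, 1), seasonal = seas, method = "ML")

# Residual check: Ljung-Box test, one fitted ARMA coefficient
Box.test(residuals(fit1), lag = 12, type = "Ljung-Box", fitdf = 1)

# Compare the candidates
AIC(fit1, fit2)
BIC(fit1, fit2)

# Forecast the next 6 months from one of the fits
fc <- predict(fit1, n.ahead = 6)
fc$pred
```

Lower AIC or BIC is preferred, but with only 60 observations small differences between candidates may not be meaningful, so the residual diagnostics deserve at least as much attention.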

Reading a good book (I used Time Series Analysis by Henrik Madsen) or lecture notes (these look good) can also help you a lot.
