Extrapolating time-series data in R


Problem description


I have time-series data for the last 20 years. The variable was measured once a year, so I have 20 values. The data is in a tab-delimited file whose first column is the year and whose second column is the value. Here is what it looks like:

1991    438
1992    408
1993    381
1994    361
1995    338
1996    315
1997    289
1998    261
1999    229
2000    206
2001    190
2002    173
2003    151
2004    141
2005    126
2006    108
2007    99
2008    93
2009    85
2010    77
2011    71
2012    67

I want to extrapolate the value in the second column for the coming years. The rate at which the values decrease is itself slowing down, so I don't think linear regression is appropriate. I would like to know in which year the second column will approach zero. I have never used R, so it would be great if you could also help me with the code for reading the data from a tab-delimited file.

Thanks

Solution

The following is a sketch that may help you get started.

## get the data
tmp <- read.table(text="1991    438
1992    408
1993    381
1994    361
1995    338
1996    315
1997    289
1998    261
1999    229
2000    206
2001    190
2002    173
2003    151
2004    141
2005    126
2006    108
2007    99
2008    93
2009    85
2010    77
2011    71
2012    67", col.names=c("Year", "value"))

library(ggplot2)

## develop a model: fit a quadratic in Year once and reuse it below
fit <- lm(value ~ poly(Year, 2), data=tmp)
tmp$pred1 <- predict(fit)

## look at the data
p1 <- ggplot(tmp, aes(x = Year, y = value)) +
  geom_line() +
  geom_point() +
  geom_hline(yintercept = 0)

print(p1)

## check the model against the observed values
p1 +
  geom_line(aes(y = pred1), color="red")

## extrapolate based on the model
pred <- data.frame(Year=1990:2050)
pred$value <- predict(fit, newdata=pred)

p1 +
  geom_line(color="red", data=pred)
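
The question also asked how to read the data from a tab-delimited file rather than from inline text. A minimal sketch, assuming the file has no header row; the temporary file and the name `tmp_file` are only stand-ins so the example runs on its own:

```r
## write a few rows to a temporary tab-delimited file so the example is
## self-contained; in practice you would point `path` at your own file
path <- tempfile(fileext = ".txt")   # hypothetical stand-in for your file
writeLines(c("1991\t438", "1992\t408", "1993\t381"), path)

## read.delim() expects tab-separated fields; header = FALSE because the
## file has no row of column names, so we supply them via col.names
tmp_file <- read.delim(path, header = FALSE, col.names = c("Year", "value"))
str(tmp_file)
```

The resulting data frame has the same two columns as `tmp`, so it should be usable in place of the `read.table(text=...)` call.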

In this case the model says the line will never cross zero. If that makes no sense for your data, pick a different model. Whatever model you pick, plot its predictions alongside the data so you can see how well it fits.
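
As one possible "different model", here is a sketch of an exponential-decay fit, done as a straight line on the log scale. This is an assumption for illustration, not part of the original answer: the names `fit_exp`, `pred_exp`, and `year_below` are made up here, and the cutoff of 1 for "close to zero" is arbitrary. An exponential curve never actually reaches zero, so the code reports the first year in which the fitted value falls to the cutoff or below.

```r
## same data as in the answer, repeated so this block runs on its own
tmp <- data.frame(
  Year  = 1991:2012,
  value = c(438, 408, 381, 361, 338, 315, 289, 261, 229, 206, 190,
            173, 151, 141, 126, 108, 99, 93, 85, 77, 71, 67)
)

## alternative model (an assumption): exponential decay,
## i.e. log(value) is modelled as linear in Year
fit_exp <- lm(log(value) ~ Year, data = tmp)

## extrapolate and back-transform predictions to the original scale
pred_exp <- data.frame(Year = 1990:2080)
pred_exp$value <- exp(predict(fit_exp, newdata = pred_exp))

## solve log(threshold) = intercept + slope * Year for Year, then round up
## to the first whole year at or below the cutoff
threshold <- 1   # arbitrary cutoff for "close to zero"
b <- coef(fit_exp)
year_below <- ceiling((log(threshold) - b[["(Intercept)"]]) / b[["Year"]])
year_below
```

To compare it with the quadratic fit, `pred_exp` can be drawn on the same plot in the same way as `pred`, for example with `p1 + geom_line(color="blue", data=pred_exp)`. Whether exponential decay is a sensible description of the underlying process is a judgment call that the plot can help with.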
