R: Grouped rolling window linear regression with rollapply and ddply

Question

I have a data set with several grouping variables on which I want to run a rolling window linear regression. The ultimate goal is to extract the 10 linear regressions with the lowest slopes and average them to give a mean minimum rate of change. I have found examples that use rollapply to calculate rolling window linear regressions, but I have the added complication that I would like to apply these regressions to groups within the data set.

Here is a sample data set and my current code, which is close but isn't quite working.

dat<-data.frame(w=c(rep(1,27), rep(2,27),rep(3,27)), z=c(rep(c(1,2,3),27)), 
x=c(rep(seq(1,27),3)), y=c(rnorm(27,10,3), rnorm(27,3,2.2), rnorm(27, 6,1.3)))

where w and z are two grouping variables and x and y are the regression terms.

From my internet searches, here is a basic rolling window linear regression in which the window size is 6, sequential regressions are separated by 3 data points, and I extract only the slope, coef(lm...)[2]:

library(zoo)    
slopeData<-rollapply(zoo(dat), width=6, function(Z) { 
coef(lm(formula=y~x, data = as.data.frame(Z), na.rm=T))[2]
}, by = 3, by.column=FALSE, align="right")

Now I wish to apply this rolling window regression to the groups specified by the two grouping variables w and z, so I tried something like this using ddply from the plyr package. First I rewrote the code above as a function.

rolled<-function(df) {
    rollapply(zoo(df), width=6, function(Z) { 
    coef(lm(formula=y~x, data = as.data.frame(Z), na.rm=T))[2]
    }, by = 3, by.column=FALSE, align="right")
}

Then I apply that function using ddply:

groupedSlope <- ddply(dat, .(w,z), function(d) rolled(d))

This, however, doesn't work: I get a series of warnings and errors. I suspect that some of the errors relate to combining zoo objects with data frames, and that the approach has become overly complicated. This is what I have been working on so far, but does anyone know of a way to get grouped, rolling window linear regressions, potentially simpler than this method?

Thanks for your assistance, Nate

Recommended answer

1) rollapply works on data frames too so it is not necessary to convert df to zoo.

2) lm uses na.action, not na.rm, and its default is na.omit so we can just drop this argument.

3) rollapplyr is a more concise way to write rollapply(..., align = "right").
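
Point 2 can be checked in base R without zoo. The following is only a small sketch on a made-up data frame d (not the data from the question): it fits the same model with the default settings, with an explicit na.action = na.omit, and with the na.rm = TRUE argument from the question. The handler just reports any warning that lm raises for the unrecognised argument instead of assuming its exact wording.

```r
# Small illustrative data set (hypothetical) with one missing response value
d <- data.frame(x = 1:6, y = c(2.1, 3.9, NA, 8.2, 9.8, 12.1))

# Default behaviour: the row containing NA is dropped before fitting
fit_default <- lm(y ~ x, data = d)

# Spelling out na.action = na.omit gives the same fit
fit_omit <- lm(y ~ x, data = d, na.action = na.omit)

# na.rm is not an lm() argument; report any warning it triggers and carry on
fit_narm <- withCallingHandlers(
  lm(y ~ x, data = d, na.rm = TRUE),
  warning = function(w) {
    message("lm() warned: ", conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

nobs(fit_default)                                  # observations actually used
all.equal(coef(fit_default), coef(fit_omit))       # same coefficients
all.equal(coef(fit_default), coef(fit_narm))       # na.rm changed nothing
```

All three fits use the five complete rows, so dropping na.rm from the question's code should not change the slopes.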

Assuming that rolled otherwise does what you want, and incorporating these changes into it, the ddply statement in the question should work. Alternatively, we could use by from base R, as shown below:

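library(zoo)

# Rolling slope of y ~ x: window of 6 rows, evaluated every 3rd row
rolled <- function(df) {
    rollapplyr(df, width = 6, function(m) {
        coef(lm(formula = y ~ x, data = as.data.frame(m)))[2]
    }, by = 3, by.column = FALSE)
}

# Apply rolled() to each w/z group and stack the results row-wise
do.call("rbind", by(dat, dat[c("w", "z")], rolled))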
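
To get from the per-group slopes to the end goal stated in the question (the mean of the lowest slopes), something along these lines might work. It is only a sketch under a few assumptions: mean_min_slope and n_lowest are names made up here, set.seed is added so the random sample data is reproducible, and base R split/sapply is used in place of ddply. Note that each w/z group in the sample data has only 9 rows, so far fewer than 10 windows exist per group and head() simply takes whatever slopes are available.

```r
library(zoo)

set.seed(1)  # assumption: fixed seed so the sample data is reproducible
dat <- data.frame(
  w = rep(1:3, each = 27),
  z = rep(1:3, times = 27),
  x = rep(1:27, times = 3),
  y = c(rnorm(27, 10, 3), rnorm(27, 3, 2.2), rnorm(27, 6, 1.3))
)

# Rolling slope of y ~ x within one group (same settings as the answer)
rolled <- function(df) {
  rollapplyr(df, width = 6, function(m) {
    coef(lm(y ~ x, data = as.data.frame(m)))[2]
  }, by = 3, by.column = FALSE)
}

# Hypothetical helper: mean of the n_lowest smallest slopes in one group.
# If a group yields fewer than n_lowest slopes, all of them are averaged.
mean_min_slope <- function(df, n_lowest = 10) {
  slopes <- as.numeric(rolled(df))
  mean(head(sort(slopes), n_lowest))
}

# One data frame per w/z combination, then one summary value per group
groups <- split(dat, dat[c("w", "z")], drop = TRUE)
result <- sapply(groups, mean_min_slope)
result
```

The result is a named numeric vector with one entry per w/z combination. On a real data set with longer groups, n_lowest = 10 would pick the ten smallest slopes as described in the question.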
