Using rollapply and lm over multiple columns of data
Question
I have a data frame similar to the following, with a total of 500 columns:
Probes <- data.frame(Days=seq(0.01, 4.91, 0.01), B1=5:495,B2=-100:390, B3=10:500,B4=-200:290)
I would like to calculate a rolling window linear regression where my window size is 12 data points and each sequential regression is separated by 6 data points. For each regression, "Days" will always be the x component of the model, and the y's would be each of the other columns (B1, followed by B2, B3, etc.). I would then like to save the coefficients as a data frame with the existing column titles (B1, B2, etc.).
I think my code is close, but it is not quite working. I used rollapply from the zoo package.
slopedata<-rollapply(zoo(Probes), width=12, function(Probes) {
coef(lm(formula=y~Probes$Days, data = Probes))[2]
}, by = 6, by.column=TRUE, align="right")
If possible, I would also like to have the "xmins" saved to a vector to add to the data frame. This would be the smallest x value used in each regression (basically every 6th value in the "Days" column). Thanks for your help.
Recommended answer
1) Define a zoo object z whose data contains Probes and whose index is taken from the first column of Probes, i.e. Days. Noting that lm allows y to be a matrix, define a coefs function which computes the regression coefficients. Finally, rollapply over z. Note that the index of the returned object gives xmin.
library(zoo)
z <- zoo(Probes, Probes[[1]])
coefs <- function(z) c(unlist(as.data.frame(coef(lm(z[,-1] ~ z[,1])))))
rz <- rollapply(z, 12, by = 6, coefs, by.column = FALSE, align = "left")
giving:
> head(rz)
B11 B12 B21 B22 B31 B32 B41 B42
0.01 4 100 -101 100 9 100 -201 100
0.07 4 100 -101 100 9 100 -201 100
0.13 4 100 -101 100 9 100 -201 100
0.19 4 100 -101 100 9 100 -201 100
0.25 4 100 -101 100 9 100 -201 100
0.31 4 100 -101 100 9 100 -201 100
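The column names in this output pair each probe with a coefficient position: B11 is the intercept for B1 and B12 is its slope, and likewise for the other probes. A minimal base-R sketch on the first 12-row window, assuming the Probes data frame from the question, shows where that layout comes from (the names w, m and flat are only illustrative):

```r
# Data frame from the question
Probes <- data.frame(Days = seq(0.01, 4.91, 0.01), B1 = 5:495, B2 = -100:390,
                     B3 = 10:500, B4 = -200:290)

# First window of 12 rows, as a matrix
w <- as.matrix(Probes[1:12, ])

# With a matrix response, lm fits one regression per response column;
# coef() then returns a matrix with one row for the intercept and one for the slope
m <- coef(lm(w[, -1] ~ w[, 1]))

# Flattening column by column yields the names B11, B12, B21, B22, ...
flat <- unlist(as.data.frame(m))
```

Here flat has the same names as the columns of rz, which is why each probe contributes two columns to the rolled result.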
Note that DF <- fortify.zoo(rz) could be used if you needed a data frame representation of rz.
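The question asked for only the slopes, stored under the existing column titles together with xmin. One way this might be done, as a sketch building on approach 1 rather than part of the original answer, is to keep just the slope row of the coefficient matrix. The names slopes, rs and slopeDF are assumptions introduced for illustration:

```r
library(zoo)

# Data frame from the question
Probes <- data.frame(Days = seq(0.01, 4.91, 0.01), B1 = 5:495, B2 = -100:390,
                     B3 = 10:500, B4 = -200:290)
z <- zoo(Probes, Probes[[1]])

# Keep only the slope row of the coefficient matrix; the result is a
# vector named B1, B2, ..., so the rolled output carries those names
slopes <- function(w) coef(lm(w[, -1] ~ w[, 1]))[2, ]

rs <- rollapply(z, 12, by = 6, slopes, by.column = FALSE, align = "left")

# Data frame with the window start (smallest Days value) as its own column
slopeDF <- data.frame(xmin = index(rs), coredata(rs), row.names = NULL)
```

With align = "left" the index of rs is the first Days value in each window, so it can be used directly as xmin.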
2) An alternative, somewhat similar approach would be to rollapply over the row numbers:
library(zoo)
y <- as.matrix(Probes[-1])
Days <- Probes$Days
n <- nrow(Probes)
coefs <- function(ix) c(unlist(as.data.frame(coef(lm(y ~ Days, subset = ix)))),
xmins = Days[ix][1])
r <- rollapply(1:n, 12, by = 6, coefs)
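As a cross-check that does not depend on zoo, the same windows can be walked with a plain base-R loop over the start rows. This is a swapped-in sketch, not the answer's method; it keeps only the slopes plus xmin, and the names starts, res and resDF are assumptions introduced here:

```r
# Data frame from the question
Probes <- data.frame(Days = seq(0.01, 4.91, 0.01), B1 = 5:495, B2 = -100:390,
                     B3 = 10:500, B4 = -200:290)
y <- as.matrix(Probes[-1])
Days <- Probes$Days
n <- nrow(Probes)

# Window start rows 1, 7, 13, ... such that every 12-row window fits in the data
starts <- seq(1, n - 12 + 1, by = 6)

# One row per window: xmin followed by the slope for every probe column
res <- t(sapply(starts, function(s) {
  ix <- s:(s + 11)
  c(xmin = Days[s], coef(lm(y ~ Days, subset = ix))[2, ])
}))
resDF <- as.data.frame(res)
```

Each row of resDF corresponds to one window, with the columns xmin, B1, B2, B3 and B4.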