Linear Regression and storing results in data frame
Problem description
I am running a linear regression on some variables in a data frame. I'd like to subset the data by a categorical variable, run the regression separately for each level of that variable, and then store the t-stats in a data frame. I'd like to do this without a loop if possible.
Here's a sample of what I'm trying to do:
a<- c("a","a","a","a","a",
"b","b","b","b","b",
"c","c","c","c","c")
b<- c(0.1,0.2,0.3,0.2,0.3,
0.1,0.2,0.3,0.2,0.3,
0.1,0.2,0.3,0.2,0.3)
c<- c(0.2,0.1,0.3,0.2,0.4,
0.2,0.5,0.2,0.1,0.2,
0.4,0.2,0.4,0.6,0.8)
cbind(a,b,c)
I can start by running the regression on the full data and pulling out the t-statistic very easily:
summary(lm(b~c))$coefficients[2,3]
However, I'd like to run the regression separately for the rows where column a is "a", "b", or "c". I'd then like to store the t-stats in a table that looks like this:
variable t-stat
a 0.9
b 2.4
c 1.1
Hope that makes sense. Please let me know if you have any suggestions!
Recommended answer
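Here's a vote for the plyr package and ddply(). ddply() splits the data frame by the grouping column, applies a function to each piece, and combines the results into a new data frame, so no explicit loop is needed.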
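library(plyr)
# Build the data frame from the question's vectors a, b, c
dF <- data.frame(a, b, c)
# Return the t-statistic of the slope on c for one group
plyrFunc <- function(x){
  mod <- lm(b~c, data = x)
  return(summary(mod)$coefficients[2,3])
}
# Apply plyrFunc to each level of a and collect the results
tStats <- ddply(dF, .(a), plyrFunc)
tStats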
a V1
1 a 1.6124515
2 b -0.1369306
3 c 0.6852483
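If installing plyr is not an option, the same split-apply-combine idea can be sketched in base R with split() and sapply(). This is only a minimal sketch under assumptions: the helper name slopeT and the output column names variable and t.stat are made up for illustration, and the data frame is rebuilt here from the question's sample values so the block runs on its own.

```r
# Sample data from the question, assembled into a data frame
dF <- data.frame(
  a = rep(c("a", "b", "c"), each = 5),
  b = rep(c(0.1, 0.2, 0.3, 0.2, 0.3), times = 3),
  c = c(0.2, 0.1, 0.3, 0.2, 0.4,
        0.2, 0.5, 0.2, 0.1, 0.2,
        0.4, 0.2, 0.4, 0.6, 0.8)
)

# Hypothetical helper: t-statistic of the slope on c for one group's rows
slopeT <- function(x) {
  summary(lm(b ~ c, data = x))$coefficients["c", "t value"]
}

# split() makes one data frame per level of a; sapply() fits each one
tVals <- sapply(split(dF, dF$a), slopeT)

# Collect the named vector into a two-column data frame
tStats <- data.frame(variable = names(tVals), t.stat = unname(tVals))
tStats
```

On the sample data this should give the same three t-statistics as the ddply() call, with the column names chosen above in place of a and V1.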