R: extract regression results using two or more for loops
Problem description
Based on this post, I created the following matrix and for loops to loop through all regression combinations in my df:
all_lm <-data.frame(matrix(nrow=180, ncol=9))
names(all_lm)=c("col1", "col2", "Estimate", " Std. Error", " z value", " pValue", "2.5%", "97.5%", "r^2")
and to save the results, this:
for (i in c("A","B","C"))
  for (j in c(1:10))
    for (k in c("D","E"))
      for (l in c("F", "G", "H")){
        form <- formula(paste0(i,"_PC_AB_",k, " ~ ", l))
        result<-lm(form, data = schools, subset=Decile==j)
        all_lm[i,1]<-i
        all_lm[i,2]<-j
        all_lm[i,3]<-round(coef(summary(result))[2,1],3)
        all_lm[i,4]<-round(coef(summary(result))[2,2],3)
        all_lm[i,5]<-round(coef(summary(result))[2,3],3)
        all_lm[i,6]<-round(coef(summary(result))[2,4],3)
        all_lm[i,7]<-round(confint(result)[2,1],2)
        all_lm[i,8]<-round(confint(result)[2,2],2)
        all_lm[i,9]<-round(summary(result)$r.squared, 3)
      }
This loop configuration works when I use it to export plots in Cairo, but I realise that all_lm[i,n] is an incorrect approach. I do not know enough about R to solve this. I've tried various combinations such as all_lm[i,j,k,n]. I have also tried { after each for, but this did not work. How can I loop through the 180 regressions and store the results in my matrix?
Solution
Most of the time in R, if you're being drawn to using a for loop (let alone nested for loops), you're probably on the wrong track.
The general approach to solving your problem is to use the expand.grid function to create all combinations of the inputs, then use mapply to run the regression on each combination of inputs and return a list of results, then use do.call to combine the list of results into a data frame.
Your code should look something like this:
i <- c('A','B','C')
j <- 1:10
k <- c('D','E')
l <- c('F','G','H')
params <- expand.grid(i, j, k, l, stringsAsFactors = FALSE)
You now have a data frame of all combinations of inputs.
> head(params)
  Var1 Var2 Var3 Var4
1    A    1    D    F
2    B    1    D    F
3    C    1    D    F
4    A    2    D    F
5    B    2    D    F
6    C    2    D    F
> tail(params)
    Var1 Var2 Var3 Var4
175    A    9    E    H
176    B    9    E    H
177    C    9    E    H
178    A   10    E    H
179    B   10    E    H
180    C   10    E    H
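As a small aside, here is a minimal sketch of the same grid built with named arguments, so the columns are called i, j, k and l rather than Var1 to Var4. The input values are the ones from the question; the column names are just an illustrative choice.

```r
# Same inputs as above, but passed as named arguments to expand.grid
params <- expand.grid(
  i = c("A", "B", "C"),
  j = 1:10,
  k = c("D", "E"),
  l = c("F", "G", "H"),
  stringsAsFactors = FALSE
)

nrow(params)           # 3 * 10 * 2 * 3 combinations
names(params)          # column names follow the argument names
sapply(params, class)  # character inputs stay character, j stays integer
```

The first argument varies fastest, which is why the rows shown above cycle through A, B, C before the second column changes.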
Now set up a function that mapply will use for each row of the params data frame.
# Fit one regression and return its slope statistics as a named list
one_lm <- function(i, j, k, l) {
  # Build the formula, e.g. A_PC_AB_D ~ F
  form <- formula(paste0(i,"_PC_AB_",k, " ~ ", l))
  result <- lm(form, data = schools, subset=Decile==j)
  list(
    col1 = i,
    col2 = j,
    estimate = round(coef(summary(result))[2,1],3),
    std_err = round(coef(summary(result))[2,2],3),
    # lm reports a t statistic here; the name follows the question's "z value" column
    z_value = round(coef(summary(result))[2,3],3),
    p_value = round(coef(summary(result))[2,4],3),
    pct_2.5 = round(confint(result)[2,1],2),
    pct_97.5 = round(confint(result)[2,2],2),
    r_square = round(summary(result)$r.squared, 3)
  )
}
Now use mapply to process the combinations one at a time and return, for each row of params, a list containing the estimate, std_err, and so on.
result_list <- mapply(one_lm, params[,1], params[,2], params[,3], params[,4], SIMPLIFY = FALSE)
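To make the mapply call less opaque, here is a toy sketch of how it walks several vectors in parallel. The function add_label and its inputs are hypothetical and exist only for this illustration.

```r
# Hypothetical helper: returns a small named list for each pair of inputs
add_label <- function(x, y) list(x = x, sum = x + y)

# mapply calls add_label(1, 4), add_label(2, 5), add_label(3, 6) in turn
out <- mapply(add_label, 1:3, 4:6, SIMPLIFY = FALSE)

length(out)    # one list element per pair of inputs
out[[2]]$sum   # result of the second call, 2 + 5
```

SIMPLIFY = FALSE keeps the result as a plain list with one element per call. Without it, mapply may try to collapse the results into a matrix.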
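You can then combine all those lists into a data frame using the do.call and rbind functions together. Converting each list to a one-row data frame first makes rbind return a data frame rather than a matrix of lists.
results <- do.call(rbind, lapply(result_list, as.data.frame))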
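Since the schools data is not available here, the following is a self-contained sketch of the same expand.grid, mapply and do.call pattern on the built-in mtcars data set. The choice of mtcars, the variable names (mpg, qsec, wt, hp, cyl) and the helper name one_lm_demo are assumptions made purely for illustration, not part of the original problem.

```r
# Assumption: mtcars stands in for `schools`; cyl plays the role of Decile
params <- expand.grid(
  y = c("mpg", "qsec"),
  x = c("wt", "hp"),
  cyl = c(4, 6, 8),
  stringsAsFactors = FALSE
)

# Hypothetical helper: fit y ~ x within one cyl group, return a one-row data frame
one_lm_demo <- function(y, x, cyl) {
  form <- reformulate(x, response = y)
  # Subset up front so the `cyl` argument is not confused with the column
  fit <- lm(form, data = mtcars[mtcars$cyl == cyl, ])
  co <- coef(summary(fit))
  data.frame(
    y = y, x = x, cyl = cyl,
    estimate = round(co[2, 1], 3),
    p_value = round(co[2, 4], 3),
    r_square = round(summary(fit)$r.squared, 3)
  )
}

rows <- mapply(one_lm_demo, params$y, params$x, params$cyl, SIMPLIFY = FALSE)
results <- do.call(rbind, rows)
rownames(results) <- NULL  # drop the row names inherited from mapply
nrow(results)              # one row per combination in params
```

Returning a one-row data frame from the helper, instead of a list, means rbind produces an ordinary data frame whose columns keep their types.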