dplyr and for loop in R
Problem description
So here is the problem: I want to use a for loop in my R code to summarize different columns.
As an example, here is what it could look like:
all.columns <- c("column4", "column5", "column6", "column7")
for (i in 1:4) {
  df %>%
    group_by(column3) %>%
    summarise(Mean = mean(all.columns[i]),
              Max = max(all.columns[i]))
}
Where df is a data frame, column3 could be a grouping variable such as Year, and columns 4 to 7 are the ones that I want to summarize repeatedly with the same code.
Do you know how to do this with dplyr? If you have an alternative without dplyr, I'd like to hear about it.
I've tried passing the column names as character strings, but it's not working...
Solution
How about this:
Fake data:
df <- data.frame(column3 = rep(letters[1:2], 10),
                 column4 = rnorm(20),
                 column5 = rnorm(20),
                 column6 = rnorm(20),
                 column7 = rnorm(20))
dplyr solution:
library(dplyr)
df %>%
  group_by(column3) %>%
  summarise_each(funs(mean, max), column4:column7)
Output:
Source: local data frame [2 x 9]

  column3 column4_mean column5_mean column6_mean column7_mean column4_max column5_max
1       a     0.186458   0.02662053  -0.00874544    0.3327999    1.563171    2.416697
2       b     0.336329  -0.08868817   0.31777871    0.1934266    1.263437    1.142430
Variables not shown: column6_max (dbl), column7_max (dbl)
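A note for readers on a recent dplyr: summarise_each() and funs() are deprecated there, so the call above may warn or fail. The sketch below is not part of the original answer. It assumes dplyr 1.0.0 or later and shows two possible variants: an across() version of the same summary, and a for loop closer to what the question asked for. The loop in the question fails because all.columns[i] is just a character string inside summarise(), so mean() and max() are applied to the string rather than to the column; looking the column up with .data[[col]] avoids that. The set.seed() call and the names wide, results and col are illustrative assumptions.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed so the fake data is reproducible
df <- data.frame(column3 = rep(letters[1:2], 10),
                 column4 = rnorm(20),
                 column5 = rnorm(20),
                 column6 = rnorm(20),
                 column7 = rnorm(20))

# Variant 1: across() applies every function in the list to every selected column.
# Result columns are named <column>_<function>, e.g. column4_mean, column4_max.
wide <- df %>%
  group_by(column3) %>%
  summarise(across(column4:column7, list(mean = mean, max = max)))

# Variant 2: a for loop over column names stored as strings.
# .data[[col]] fetches the column whose name is held in col.
all.columns <- c("column4", "column5", "column6", "column7")
results <- list()
for (col in all.columns) {
  results[[col]] <- df %>%
    group_by(column3) %>%
    summarise(Mean = mean(.data[[col]]),
              Max  = max(.data[[col]]))
}

wide
results[["column5"]]
```

Variant 1 gives one wide table like the output above (with the columns in a different order), while variant 2 keeps one small table per column in the results list, which may be handier if each column is inspected separately.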