dplyr and for loop in R


Problem description

So here is the problem: I want to use a for loop in my R code to summarize different columns.

As an example, here is what it could look like:

all.columns<-c("column4","column5","column6","column7")
for (i in 1:4) {  
df%>%
 group_by(column3)%>%
 summarise(Mean=mean(all.columns[i]),
           Max=max(all.columns[i]))
} 

Here df is a data frame, column3 could be a grouping variable such as Year, and columns 4 to 7 are the ones that I want to summarize repeatedly with the same code.

Do you know how to do this with dplyr? If you have an alternative without dplyr, I'd like to hear about it.

I've tried passing the column names as character strings, but it doesn't work.

Solution

How about this:

Fake data:

df <- data.frame(column3=rep(letters[1:2], 10), 
                 column4=rnorm(20),
                 column5=rnorm(20),
                 column6=rnorm(20),
                 column7=rnorm(20))

dplyr solution:

library(dplyr)
df %>% 
  group_by(column3) %>% 
  summarise_each(funs(mean, max), column4:column7)

Output:

Source: local data frame [2 x 9]

  column3 column4_mean column5_mean column6_mean column7_mean column4_max column5_max
1       a     0.186458   0.02662053  -0.00874544    0.3327999    1.563171    2.416697
2       b     0.336329  -0.08868817   0.31777871    0.1934266    1.263437    1.142430
Variables not shown: column6_max (dbl), column7_max (dbl)
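
A note for newer dplyr versions: summarise_each() and funs() have since been deprecated, and across() (available from dplyr 1.0.0) is the usual replacement. The sketch below is only an illustration under that assumption, not part of the original answer. It also shows one way the loop from the question could be made to work: mean(all.columns[i]) is computed on the character string itself (which gives NA with a warning), whereas .data[[col]] looks the column up by name. The seed and the names wide, results and col are introduced here for illustration.

```r
library(dplyr)

set.seed(42)  # assumption: a seed, added only so the sketch is reproducible
df <- data.frame(column3 = rep(letters[1:2], 10),
                 column4 = rnorm(20),
                 column5 = rnorm(20),
                 column6 = rnorm(20),
                 column7 = rnorm(20))

# 1. across() applies every function in the list to every selected column.
#    Result columns are named <column>_<function>, e.g. column4_mean.
wide <- df %>%
  group_by(column3) %>%
  summarise(across(column4:column7, list(mean = mean, max = max)))

# 2. The loop from the question, with the column looked up by its name.
all.columns <- c("column4", "column5", "column6", "column7")
results <- list()  # one summary table per column, keyed by column name
for (col in all.columns) {
  results[[col]] <- df %>%
    group_by(column3) %>%
    summarise(Mean = mean(.data[[col]]),
              Max  = max(.data[[col]]))
}

print(wide)
print(results[["column4"]])
```

With across(), the result has the same nine columns as the output above, but ordered column by column (column4_mean, column4_max, column5_mean, ...) rather than function by function. The loop version returns a list of small tables instead of one wide table, which may be handier if each column is inspected separately.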
