Using multiple different group_by variables (dplyr) to summarise a dataframe
Problem description
I have a dataframe "my_data" which contains 6 columns:
group1.members group2.members group3.members price price.2 price.3
1 1 1 800 877 334
1 2 1 850 877 334
2 2 1 859 877 334
3 1 1 859 877 334
3 1 2 870 877 334
2 2 2 870 877 334
2 3 2 870 877 334
1 3 3 880 877 334
I would like to summarise the "price" columns of my_data by row into several separate data frames, using group_by on a different "group.members" column for each one. It seems, though, that group_by does not allow this?
This is what I had in mind:
my_data <- as.data.frame(data)
num_of_years <- c(1,2,3)
for(i in 1:length(num_of_years)){
price_means <- my_data %>% group_by(my_data[i]) %>%
select(-value) %>%
summarise_each(funs(mean(., na.rm=TRUE))) %>%
ungroup
assign(paste("PriceMeans",i,sep=""),price_means, envir = .GlobalEnv)
}
In other words:
- for i = 1, use group_by(group1.members)
- for i = 2, use group_by(group2.members)
- for i = 3, use group_by(group3.members)
EDIT: my solution is below:
for(i in 1:length(my_groups)){
# construct the group to select
current.group <- my_groups[i]
current.group <- paste0("memb_", current.group)
# construct the groups to exclude
groups.to.drop <- my_groups[-i]
groups.to.drop <- paste0("memb_", groups.to.drop)
# Get Means
Means <- data %>% group_by_(as.name(current.group)) %>%
select(- c(ID, get(groups.to.drop))) %>%
summarise_each(funs(mean(., na.rm = TRUE)))
Means <- Means[,-1:-(length(my_groups)-1)]
Means <- as.list(Means)
assign(x = paste0("Means_",i),
value = Means,
envir = parent.env(new.env()))
}
Recommended answer
I am by no means a dplyr expert, but this seems to accomplish what you are trying to do:
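for (i in seq_along(num_of_years)) {
  # name of the i-th grouping column, e.g. "group1.members"
  var1 <- names(my_data)[[i]]
  price_means <- my_data %>%
    # keep only the i-th grouping column and the three price columns
    select(eval(i), price, price.2, price.3) %>%
    # group_by_() takes the column name as a string
    group_by_(var1) %>%
    summarise_each(funs(mean(., na.rm = TRUE))) %>%
    ungroup()
  # store the result as PriceMeans1, PriceMeans2, PriceMeans3
  assign(paste("PriceMeans", i, sep = ""), price_means, envir = .GlobalEnv)
}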
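For what it's worth, group_by_(), summarise_each() and funs() have been deprecated in later dplyr releases. Here is a minimal sketch of the same idea with across() and all_of(), assuming dplyr 1.0.0 or newer. The data frame is rebuilt from the table in the question. The names group_cols and price_cols, and the choice to collect the results in a named list instead of calling assign(), are my own assumptions, not part of the original answer.

```r
library(dplyr)

# Sample data copied from the table in the question.
my_data <- data.frame(
  group1.members = c(1, 1, 2, 3, 3, 2, 2, 1),
  group2.members = c(1, 2, 2, 1, 1, 2, 3, 3),
  group3.members = c(1, 1, 1, 1, 2, 2, 2, 3),
  price          = c(800, 850, 859, 859, 870, 870, 870, 880),
  price.2        = rep(877, 8),
  price.3        = rep(334, 8)
)

# Hypothetical helper vectors: the columns to group by, one at a time,
# and the columns to average.
group_cols <- c("group1.members", "group2.members", "group3.members")
price_cols <- c("price", "price.2", "price.3")

# Build one summary per grouping column and keep them in a named list
# (an alternative to creating PriceMeans1, PriceMeans2, ... with assign()).
price_means <- lapply(group_cols, function(g) {
  my_data %>%
    group_by(across(all_of(g))) %>%
    summarise(
      across(all_of(price_cols), ~ mean(.x, na.rm = TRUE)),
      .groups = "drop"
    )
})
names(price_means) <- group_cols
```

Each element can then be pulled out by name, e.g. price_means[["group2.members"]] for the means grouped by the second column. A list keeps the results together and avoids writing into the global environment from inside a loop.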