How can I use dplyr to apply a function to all non-group_by columns?
Question
I'm trying to use the dplyr package to apply a function to all columns in a data.frame that are not being grouped, which I would do with aggregate():

    aggregate(. ~ Species, data = iris, mean)

where mean is applied to all columns not used for grouping. (Yes, I know I can use aggregate, but I'm trying to understand dplyr.)

I can use summarize like this:

    species <- group_by(iris, Species)
    summarize(species,
              Sepal.Length = mean(Sepal.Length),
              Sepal.Width = mean(Sepal.Width))

But is there a way to have mean() applied to all columns that are not grouped, similar to the . ~ notation of aggregate()? I have a data.frame with 30 columns that I want to aggregate, so writing out the individual statements is not ideal.

Solution

If you're willing to try an experimental version of dplyr, you can use the new (and still experimental)
summarise_each():

    devtools::install_github("hadley/dplyr", ref = "colwise")
    library(dplyr)

    iris %.%
      group_by(Species) %.%
      summarise_each(funs(mean))
    ## Source: local data frame [3 x 5]
    ##
    ##      Species Sepal.Length Sepal.Width Petal.Length Petal.Width
    ## 1     setosa        5.006       3.428        1.462       0.246
    ## 2 versicolor        5.936       2.770        4.260       1.326
    ## 3  virginica        6.588       2.974        5.552       2.026

    iris %.%
      group_by(Species) %.%
      summarise_each(funs(min, max))
    ## Source: local data frame [3 x 9]
    ##
    ##      Species Sepal.Length_min Sepal.Width_min Petal.Length_min
    ## 1     setosa              4.3             2.3              1.0
    ## 2 versicolor              4.9             2.0              3.0
    ## 3  virginica              4.9             2.2              4.5
    ## Variables not shown: Petal.Width_min (dbl), Sepal.Length_max (dbl),
    ##   Sepal.Width_max (dbl), Petal.Length_max (dbl), Petal.Width_max (dbl)

Feedback much appreciated!
This will appear in dplyr 0.2.
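
A note for readers on a current dplyr release: the %.% operator and the summarise_each()/funs() pair used above were deprecated in later versions. The following is a minimal sketch of the same aggregation, not part of the original answer, and it assumes dplyr 1.0.0 or later, where across() is available. The variable names by_species, ranges and agg are illustrative only.

```r
library(dplyr)

# Mean of every non-grouping column, per species.
# Inside a grouped summarise(), across(everything(), ...) skips the
# grouping column (Species), so only the four measurement columns are used.
by_species <- iris %>%
  group_by(Species) %>%
  summarise(across(everything(), mean))

# Several functions at once: a named list gives columns such as
# Sepal.Length_min and Sepal.Length_max (default naming "{.col}_{.fn}").
ranges <- iris %>%
  group_by(Species) %>%
  summarise(across(everything(), list(min = min, max = max)))

# Sanity check against the base R call from the question; the two
# approaches should agree on the per-species means.
agg <- aggregate(. ~ Species, data = iris, mean)
all.equal(by_species$Petal.Width, agg$Petal.Width)

print(by_species)
print(ranges)
```

Note that across() orders the output columns by input column (min and max of Sepal.Length first, then Sepal.Width, and so on), whereas the summarise_each() output above lists all the _min columns before the _max columns.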