将dplyr summarise_at与列索引一起使用 [英] Using dplyr summarise_at with column index
问题描述
我注意到,当向 dplyr :: summarize_at
提供列索引时,确定了要汇总的列,但不包括分组列。我不知道这是应该的,因为通过这种设计,使用正确的列索引取决于汇总列是位于分组列之前还是之后。
I noticed that when supplying column indices to dplyr::summarize_at
the column to be summarized is determined excluding the grouping column(s). I wonder if that is how it's supposed to be since by this design, using the correct column index depends on whether the summarising column(s) are positioned before or after the grouping columns.
这里是一个例子:
library(dplyr)
data("mtcars")
# grouping column after summarise columns
mtcars %>% group_by(gear) %>% summarise_at(3:4, mean)
## A tibble: 3 x 3
# gear disp hp
# <dbl> <dbl> <dbl>
#1 3 326.3000 176.1333
#2 4 123.0167 89.5000
#3 5 202.4800 195.6000
# grouping columns before summarise columns
mtcars %>% group_by(cyl) %>% summarise_at(3:4, mean)
## A tibble: 3 x 3
# cyl hp drat
# <dbl> <dbl> <dbl>
#1 4 82.63636 4.070909
#2 6 122.28571 3.585714
#3 8 209.21429 3.229286
# no grouping columns
mtcars %>% summarise_at(3:4, mean)
# disp hp
#1 230.7219 146.6875
# actual third & fourth columns
names(mtcars)[3:4]
#[1] "disp" "hp"
packageVersion("dplyr")
#[1] ‘0.7.2’
请注意汇总列如何根据分组列和分组列的位置而变化
Notice how the summarised columns change depending on grouping and position of the grouping column.
在其他平台上也一样吗?是错误还是功能?
Is this the same on other platforms? Is it a bug or a feature?
推荐答案
版本为 0.7.5
无法再复制此行为:
with version 0.7.5
this behavior can't be reproduced anymore:
library(dplyr)
mtcars %>% group_by(gear) %>% summarise_at(3:4, mean)
# # A tibble: 3 x 3
# gear disp hp
# <dbl> <dbl> <dbl>
# 1 3 326. 176.
# 2 4 123. 89.5
# 3 5 202. 196.
mtcars %>% group_by(cyl) %>% summarise_at(3:4, mean)
# # A tibble: 3 x 3
# cyl disp hp
# <dbl> <dbl> <dbl>
# 1 4 105. 82.6
# 2 6 183. 122.
# 3 8 353. 209.
这篇关于将dplyr summarise_at与列索引一起使用的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!