Use column index instead of name in group_by
Problem description
I want to summarize a dataframe with dplyr, like so:
> test <- data.frame(ID = c("A", "A", "B", "B"), val = c(1:4))
> test %>% group_by(ID) %>% summarize(av = mean(val))
# A tibble: 2 x 2
ID av
<fctr> <dbl>
1 A 1.5
2 B 3.5
But suppose that instead of grouping by the column called "ID" I wish to group by the first column, regardless of its name. Is there a simple way to do that?
I've tried a few naive approaches (group_by(1), group_by(.[1]), group_by(., .[1]), group_by(names(.)[1])) to no avail. I'm only just beginning to use tidyverse packages, so I may be missing something obvious.
This question is very similar, but it's about mutate and I wasn't able to generalize it to my problem. This question is also similar, but the accepted answer is to use a different package, and I'm trying to stick with dplyr.
Recommended answer
You can use one of the scoped variants (group_by_at) for this:
test %>% group_by_at(1) %>% summarise(av = mean(val))
# A tibble: 2 x 2
# ID av
# <fctr> <dbl>
#1 A 1.5
#2 B 3.5
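As a side note, group_by_at is marked as superseded in dplyr 1.0.0 and later, where across() can be used inside group_by() instead. The following is a minimal sketch of the same grouping by position, assuming dplyr >= 1.0.0 is installed; the `result` variable name is only for illustration.

```r
library(dplyr)

test <- data.frame(ID = c("A", "A", "B", "B"), val = 1:4)

# Group by the first column by position, whatever it is called.
# Assumption: dplyr >= 1.0.0, where across() is accepted inside group_by().
result <- test %>%
  group_by(across(1)) %>%
  summarise(av = mean(val))

print(result)
```

The grouping column keeps its original name (ID here), so the summary should have the same shape as the group_by_at(1) output above.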