GROUP_CONCAT with dplyr or R
Question
I am having difficulty replicating the functionality of a typical SQL GROUP_CONCAT function in dplyr. I would also like to make sure the ordering inside the groups can be controlled. Ideally I want to use the hadleyverse/tidyverse, but base R or other packages will work too.
Sample data:
ID name
1 apple
1 orange
2 orange
3 orange
3 apple
Desired output:
ID name
1 apple,orange
2 orange
3 apple,orange
Note that for ID = 3 the names are in alphabetical order, not the order in which the rows appear. I think this can probably be handled by calling arrange first, but it would be nice to control it inside the summarise statement or the like.
Recommended answer
In R, we can use one of the group-by operations.
library(dplyr)
# Per ID: drop duplicate names, sort them alphabetically, and join with ", "
df1 %>%
  group_by(ID) %>%
  summarise(name = toString(sort(unique(name))))
# ID name
# <int> <chr>
#1 1 apple, orange
#2 2 orange
#3 3 apple, orange
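Note that toString() joins with ", " (comma plus space) and unique() drops repeated names, whereas the desired output in the question uses a bare comma. The following is a minimal sketch of a variant that matches that output exactly, assuming df1 is a data frame built from the sample data (the construction of df1 below is an assumption, since the original post does not show it). It uses paste() with collapse = "," and keeps duplicates, which is closer to what GROUP_CONCAT does without DISTINCT.

```r
library(dplyr)

# Assumed reconstruction of the sample data from the question
df1 <- data.frame(
  ID   = c(1L, 1L, 2L, 3L, 3L),
  name = c("apple", "orange", "orange", "orange", "apple"),
  stringsAsFactors = FALSE
)

# sort() controls the order inside each group;
# paste(collapse = ",") joins the names without a space after the comma
res <- df1 %>%
  group_by(ID) %>%
  summarise(name = paste(sort(name), collapse = ","))

print(res)
```

For reverse alphabetical order inside each group, sort(name, decreasing = TRUE) could be used instead; wrapping name in unique() restores the de-duplicating behaviour of the answer above.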
Or using data.table:
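library(data.table)
# setDT() converts df1 to a data.table by reference; the expression in j is then evaluated once per ID
setDT(df1)[, .(name = toString(sort(unique(name)))), by = ID]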
# ID name
#1: 1 apple, orange
#2: 2 orange
#3: 3 apple, orange
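Since the question also allows base R, the same grouping can be sketched without any packages using aggregate(). This is only an illustration under the assumption that df1 holds the sample data (rebuilt here so the snippet is self-contained); it is not part of the original answer.

```r
# Assumed reconstruction of the sample data from the question
df1 <- data.frame(
  ID   = c(1L, 1L, 2L, 3L, 3L),
  name = c("apple", "orange", "orange", "orange", "apple"),
  stringsAsFactors = FALSE
)

# For each ID, sort the names alphabetically and collapse them into one string
res_base <- aggregate(
  name ~ ID,
  data = df1,
  FUN  = function(x) paste(sort(x), collapse = ",")
)

print(res_base)
```

The anonymous function passed as FUN is where the within-group ordering is controlled, so any other ordering rule can be substituted for sort(x) there.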