R - Combining duplicate rows within a dataframe in R
Problem description
I have a dataframe as shown below. Note that COL1 has duplicate entries:
COL1 COL2 COL3
10 hai 2
10 hai 3
10 pal 1
I want the output to look like the one shown below: COL1 should hold only the unique entry (10), COL2 should hold the merged entries for that group without duplicates (hai pal), and COL3 should hold the sum of the entries (2 + 3 + 1 = 6).
Output:
COL1 COL2 COL3
10 hai pal 6
Recommended answer
Perhaps we need to aggregate by group. Convert the 'data.frame' to a 'data.table' (setDT(df1)), group by 'COL1', paste the unique elements of 'COL2' together, and get the sum of 'COL3'.
library(data.table)
setDT(df1)[,.(COL2 = paste(unique(COL2), collapse=" "), COL3= sum(COL3)) , by = COL1]
# COL1 COL2 COL3
#1: 10 hai pal 6
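If data.table is not available, the same grouping can be sketched in base R. This is not part of the original answer; it swaps in aggregate() and merge() as an alternative, and it rebuilds df1 from the question's table on the assumption that COL2 is a character column and COL1 and COL3 are numeric. The names col2, col3 and result are illustrative only.

```r
# Rebuild the example data from the question (assumed column types)
df1 <- data.frame(
  COL1 = c(10, 10, 10),
  COL2 = c("hai", "hai", "pal"),
  COL3 = c(2, 3, 1),
  stringsAsFactors = FALSE
)

# Per COL1 group: paste the unique COL2 values into one string
col2 <- aggregate(df1["COL2"], by = df1["COL1"],
                  FUN = function(x) paste(unique(x), collapse = " "))

# Per COL1 group: sum COL3
col3 <- aggregate(df1["COL3"], by = df1["COL1"], FUN = sum)

# Join the two per-group summaries on the grouping column
result <- merge(col2, col3, by = "COL1")
print(result)
```

For larger data the data.table version above is likely the better fit, since it computes both summaries in a single grouped pass instead of two aggregations followed by a merge.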