Group by multiple columns and sum multiple other columns
Question
I have a data frame with about 200 columns. I want to group the table by the first 10 or so columns, which are factors, and sum the remaining columns.
I have a list of all the column names I want to group by and a list of all the columns I want to aggregate.
The output needs to be the same data frame with the same number of columns, just grouped together.
Is there a solution using data.table, plyr, or any other package?
Recommended answer
The data.table way is:
DT[, lapply(.SD,sum), by=list(col1,col2,col3,...)]
or
DT[, lapply(.SD,sum), by=colnames(DT)[1:10]]
where .SD is the (S)ubset of (D)ata excluding group columns. (Aside: If you need to refer to group columns generically, they are in .BY.)
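Since the question mentions having both lists of column names already, here is a minimal sketch of how they might be plugged in. The table and the names g1, g2, v1, v2, v3, group_cols and sum_cols are made up for illustration; .SDcols is used here to restrict .SD to the columns being summed, which is optional when every non-group column should be summed.

```r
library(data.table)

# Toy stand-in for the real ~200-column table (all names are hypothetical).
DT <- data.table(
  g1 = factor(c("a", "a", "b", "b", "a")),
  g2 = factor(c("x", "x", "y", "y", "z")),
  v1 = c(1, 2, 3, 4, 5),
  v2 = c(10, 20, 30, 40, 50),
  v3 = c(0.5, 0.5, 1, 1, 2)
)

group_cols <- c("g1", "g2")                    # columns to group by
sum_cols   <- setdiff(names(DT), group_cols)   # columns to sum

# by accepts a character vector of column names;
# .SDcols limits .SD to the columns that should be aggregated.
res <- DT[, lapply(.SD, sum), by = group_cols, .SDcols = sum_cols]
print(res)
```

The result has one row per distinct combination of the grouping columns and keeps the same columns as the input. If the summed columns may contain missing values, lapply(.SD, sum, na.rm = TRUE) passes the extra argument through to sum.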