data.table sum by group and return row with max value
Question
I have a data.table in this fashion:
dd <- data.table(f = c("a", "a", "a", "b", "b"), g = c(1, 2, 3, 4, 5))
dd
I need to sum the values of g by factor f, and finally return a single-row data.table object that has the maximum value of g but also contains the factor information, i.e.
   f g
1: b 9
My closest attempt so far is
tmp3 <- dd[, sum(g), by = f][, max(V1)]
tmp3
Which results in:
> tmp3
[1] 9
EDIT: I'm ideally looking for a purely data.table piece of code/workflow. I'm surprised that, with all the fast split-apply-combine wizardry and the ability to subset your data in the form of 'example[i = subset, ]', I haven't found a straightforward way to subset on a single value condition.
Solution
Here's one way to do it:
library(data.table)
dd <- data.table(
  f = c("a", "a", "a", "b", "b"),
  g = c(1, 2, 3, 4, 5))
##
> dd[, list(g = sum(g)), by = f][which.max(g), ]
   f g
1: b 9
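The chained call can also be split into its two steps, which may make it easier to see how the "single value condition" from the question fits into i. The sketch below is only an illustration: the names sums, top_first and top_all are made up here and do not come from the original answer. It contrasts which.max(), which keeps only the first row holding the maximum, with a logical condition that keeps every tied row.

```r
library(data.table)

dd <- data.table(f = c("a", "a", "a", "b", "b"), g = c(1, 2, 3, 4, 5))

# Step 1: aggregate g within each level of f.
# .() is data.table's alias for list(); naming the column g avoids the default V1.
# 'sums' is a hypothetical name used only for this illustration.
sums <- dd[, .(g = sum(g)), by = f]

# Step 2a: which.max() in i selects the first row that holds the maximum
top_first <- sums[which.max(g)]

# Step 2b: a logical condition in i keeps every row tied for the maximum
top_all <- sums[g == max(g)]

print(top_first)
print(top_all)
```

With the sample data both forms return the same single row for f = "b". They would differ only if two groups shared the maximum sum, in which case the logical condition returns all of them.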