提取数据框中每个组中的最大值 [英] Extract the maximum value within each group in a dataframe

查看:88
本文介绍了提取数据框中每个组中的最大值的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个带有分组变量( Gene)和值变量( Value)的数据框:

I have a data frame with a grouping variable ("Gene") and a value variable ("Value"):

Gene   Value
A      12
A      10
B      3
B      5
B      6
C      1
D      3
D      4

对于我的分组变量的每个级别,我希望提取最大值。因此,结果应该是每个级别的分组变量只有一行的数据框:

For each level of my grouping variable, I wish to extract the maximum value. The result should thus be a data frame with one row per level of the grouping variable:

Gene   Value
A      12
B      6
C      1
D      4

可以聚合有用吗?

推荐答案

在R中有很多可能性。其中有一些:

There are many possibilities to do this in R. Here are some of them:

df <- read.table(header = TRUE, text = 'Gene   Value
A      12
A      10
B      3
B      5
B      6
C      1
D      3
D      4')

# aggregate
aggregate(df$Value, by = list(df$Gene), max)
aggregate(Value ~ Gene, data = df, max)

# tapply
tapply(df$Value, df$Gene, max)

# split + lapply
lapply(split(df, df$Gene), function(y) max(y$Value))

# plyr
require(plyr)
ddply(df, .(Gene), summarise, Value = max(Value))

# dplyr
require(dplyr)
df %>% group_by(Gene) %>% summarise(Value = max(Value))

# data.table
require(data.table)
dt <- data.table(df)
dt[ , max(Value), by = Gene]

# doBy
require(doBy)
summaryBy(Value~Gene, data = df, FUN = max)

# sqldf
require(sqldf)
sqldf("select Gene, max(Value) as Value from df group by Gene", drv = 'SQLite')

# ave
df[as.logical(ave(df$Value, df$Gene, FUN = function(x) x == max(x))),]

这篇关于提取数据框中每个组中的最大值的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆