Find all largest values in a range, across different objects in data frame
Problem description
I wonder whether there is a simpler way than writing if...else... for the following case. I have a data frame and I only want the rows whose value in the "percentage" column is >= 95. Moreover, if multiple rows for one object meet this criterion, I only want the row(s) with the largest percentage. If several rows tie for the largest, I would like to keep all of them.
For example:
object city street percentage
A NY Sun 100
A NY Malino 97
A NY Waterfall 100
B CA Washington 98
B WA Lieber 95
C NA Moon 75
Then I would like the result to show:
object city street percentage
A NY Sun 100
A NY Waterfall 100
B CA Washington 98
I am able to do this with if/else statements, but I feel there should be a smarter way to say: 1. >= 95; 2. if more than one row qualifies, choose the largest; 3. if more than one row is the largest, choose them all.
Recommended answer
You can do this by creating a variable that indicates the rows that have the maximum percentage for each of the objects. We can then use this indicator to subset the data.
# your data
dat <- read.table(text = "object city street percentage
A NY Sun 100
A NY Malino 97
A NY Waterfall 100
B CA Washington 98
B WA Lieber 95
C NA Moon 75", header=TRUE, na.strings="", stringsAsFactors=FALSE)
# create an indicator to identify the rows that have the maximum
# percentage by object
id <- with(dat, ave(percentage, object, FUN=function(i) i==max(i)) )
# subset your data - keep rows that are at least 95 and have the
# maximum group percentage (given by id equal to one)
dat[dat$percentage >= 95 & id , ]
This works because the combined condition (the two tests joined with &) produces a logical vector, which can then be used to subset the rows of dat.
dat$percentage >= 95 & id
#[1] TRUE FALSE TRUE TRUE FALSE FALSE
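To see why both tests are needed, it can help to inspect the indicator on its own. The following is a minimal sketch that reuses the question's data; the names is_max and above are illustrative and do not appear in the original answer.

```r
dat <- read.table(text = "object city street percentage
A NY Sun 100
A NY Malino 97
A NY Waterfall 100
B CA Washington 98
B WA Lieber 95
C NA Moon 75", header = TRUE, na.strings = "", stringsAsFactors = FALSE)

# ave() returns a vector of the same type as its input, so the logical
# comparison inside FUN is stored as 1 (TRUE) or 0 (FALSE)
is_max <- ave(dat$percentage, dat$object, FUN = function(i) i == max(i))
is_max
# [1] 1 0 1 1 0 1

# the threshold test on its own (illustrative name)
above <- dat$percentage >= 95

# & coerces the 0/1 indicator back to logical before combining
above & is_max
```

Note that the row for object C is flagged as a group maximum (75 is the largest value in its group), so it is the threshold test that excludes it.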
Or, putting it all together:
with(dat, dat[percentage >= 95 & ave(percentage, object,
FUN=function(i) i==max(i)) , ])
# object city street percentage
# 1 A NY Sun 100
# 3 A NY Waterfall 100
# 4 B CA Washington 98
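The same result can also be reached in two explicit steps, which some readers may find easier to follow. This is a sketch of an alternative, not the original answer's method; the names keep and result are assumptions introduced for illustration.

```r
dat <- read.table(text = "object city street percentage
A NY Sun 100
A NY Malino 97
A NY Waterfall 100
B CA Washington 98
B WA Lieber 95
C NA Moon 75", header = TRUE, na.strings = "", stringsAsFactors = FALSE)

# Step 1: keep only rows at or above the threshold
keep <- dat[dat$percentage >= 95, ]

# Step 2: within each object, keep rows equal to that object's maximum;
# ave() with FUN = max repeats the group maximum for every row of the group
result <- keep[keep$percentage == ave(keep$percentage, keep$object, FUN = max), ]
result
```

Filtering first means the group maximum is computed only over rows that already pass the threshold. For this data, both approaches select the same rows.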