Filter rows within groups based on multiple conditions
Question
I have a data set in which I would like to filter rows within different groups.
Given this data frame:
group = as.factor(c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3))
fruit = as.factor(c("apples", "apples", "apples", "oranges",
                    "oranges", "apples", "oranges",
                    "bananas", "bananas", "oranges", "bananas"))
hit = c(1, 0, 1, 1,
        0, 1, 1,
        1, 0, 0, 1)
dt = data.frame(group, fruit, hit)
dt
group fruit hit
1 apples 1
1 apples 0
1 apples 1
1 oranges 1
2 oranges 0
2 apples 1
2 oranges 1
3 bananas 1
3 bananas 0
3 oranges 0
3 bananas 1
I would like to filter each group by the first fruit that occurs within it. There is a second condition: of the rows with that fruit, I only want to keep those where hit is equal to 1.
So, for group 1, apples is the first occurrence, and it has a positive hit twice, so I would like to keep those two rows.
The result would look like this:
group fruit hit
1 apples 1
1 apples 1
2 oranges 1
3 bananas 1
3 bananas 1
I know you can filter with dplyr, but I am not sure how to achieve this.
Recommended answer
We can use dplyr. After grouping by 'group', filter the rows where 'hit' is not equal to 0 and (&) 'fruit' equals the first element of 'fruit' within that group. Since 'hit' only takes the values 0 and 1 in this data, hit != 0 is equivalent to hit == 1.
library(dplyr)
dt %>%
  group_by(group) %>%
  filter(hit != 0 & fruit == first(fruit))
# group fruit hit
# <fctr> <fctr> <dbl>
#1 1 apples 1
#2 1 apples 1
#3 2 oranges 1
#4 3 bananas 1
#5 3 bananas 1
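To make the logic of the filter explicit, here is a minimal base R sketch of the same two conditions, for readers who want to see what first(fruit) contributes per group. It is not part of the original answer. The names first_rows, first_fruit, keep and result are made up for this illustration, and it assumes the rows are already in the order that defines "first occurrence".

```r
# Rebuild the example data from the question.
group <- as.factor(c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3))
fruit <- as.factor(c("apples", "apples", "apples", "oranges",
                     "oranges", "apples", "oranges",
                     "bananas", "bananas", "oranges", "bananas"))
hit <- c(1, 0, 1, 1,
         0, 1, 1,
         1, 0, 0, 1)
dt <- data.frame(group, fruit, hit)

# Mark the first row of each group (hypothetical helper name).
first_rows <- !duplicated(dt$group)

# Lookup table: group label -> first fruit seen in that group.
first_fruit <- setNames(as.character(dt$fruit[first_rows]),
                        as.character(dt$group[first_rows]))

# Keep rows whose fruit matches the group's first fruit AND whose hit is 1.
keep <- dt$hit == 1 &
  as.character(dt$fruit) == first_fruit[as.character(dt$group)]
result <- dt[keep, ]
result
```

The lookup vector plays the role that first(fruit) plays inside group_by() in the dplyr version: every row is compared against the first fruit of its own group, and the hit condition is applied at the same time.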