Retain rows up to the first occurrence of a value in a column, by group (groups without the value allowed)
Problem description
I have a data frame like this one:
> df
id type
1 1 a
2 1 a
3 1 b
4 1 a
5 1 b
6 2 a
7 2 a
8 2 b
9 3 a
10 3 a
I want to keep all rows for each group (id) up to the first occurrence of value 'b' in the type column. For groups without type 'b', I want to keep all their rows.
The resulting data frame should look like this:
> dfnew
id type
1 1 a
2 1 a
3 1 b
4 2 a
5 2 a
6 2 b
7 3 a
8 3 a
I tried the following code, but it retains additional rows with the value 'a' beyond the first occurrence of 'b' and only excludes further occurrences of 'b', which is not what I want. Look at row 4 in the output below. I want to get rid of it.
> df %>% group_by(id) %>% filter(cumsum(type == 'b') <= 1)
Source: local data frame [9 x 2]
Groups: id
id type
1 1 a
2 1 a
3 1 b
4 1 a
5 2 a
6 2 a
7 2 b
8 3 a
9 3 a
Recommended answer
You could combine match or which with slice, or (as mentioned by @Richard) use which.max:
library(dplyr)
df %>%
  group_by(id) %>%
  slice(if (any(type == "b")) 1:which.max(type == "b") else row_number())
# Source: local data table [8 x 2]
# Groups: id
#
# id type
# 1 1 a
# 2 1 a
# 3 1 b
# 4 2 a
# 5 2 a
# 6 2 b
# 7 3 a
# 8 3 a
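The match variant mentioned above is not spelled out in the answer, so here is a minimal sketch of how it might look. It assumes the df from the question, rebuilt here so the snippet runs on its own. The use of nomatch = n() to handle groups without 'b' is my own choice, not something the answer states.

```r
library(dplyr)

# Example data reconstructed from the question
df <- data.frame(
  id   = c(1, 1, 1, 1, 1, 2, 2, 2, 3, 3),
  type = c("a", "a", "b", "a", "b", "a", "a", "b", "a", "a")
)

dfnew <- df %>%
  group_by(id) %>%
  # match() returns the position of the first "b" in the group;
  # nomatch = n() falls back to the group size, so groups without "b"
  # keep all of their rows
  slice(seq_len(match("b", type, nomatch = n()))) %>%
  ungroup()

dfnew
```

This avoids the explicit if/else because the fallback is folded into match itself.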
Or you could try data.table:
library(data.table)
setDT(df)[, if(any(type == "b")) .SD[1:which.max(type == "b")] else .SD, by = id]
# id type
# 1: 1 a
# 2: 1 a
# 3: 1 b
# 4: 2 a
# 5: 2 a
# 6: 2 b
# 7: 3 a
# 8: 3 a
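If you would rather stay close to the filter/cumsum attempt from the question, one possible fix is to count only the 'b' rows that come strictly before the current row. This is a sketch under that assumption, not part of the original answer; df is rebuilt from the question so the snippet is self-contained.

```r
library(dplyr)

# Example data reconstructed from the question
df <- data.frame(
  id   = c(1, 1, 1, 1, 1, 2, 2, 2, 3, 3),
  type = c("a", "a", "b", "a", "b", "a", "a", "b", "a", "a")
)

dfnew <- df %>%
  group_by(id) %>%
  # cumsum(type == "b") - (type == "b") is the number of "b" rows strictly
  # before the current row; keep a row only while that count is still zero
  filter(cumsum(type == "b") - (type == "b") == 0) %>%
  ungroup()

dfnew
```

The original condition cumsum(type == 'b') <= 1 stays true for every row after the first 'b' until a second 'b' appears, which is why row 4 survived. Subtracting the current row's own indicator makes the first 'b' the last row that passes.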