Subset by multiple conditions
Question
Maybe it's something basic, but I couldn't find the answer.
I have
Id Year V1
1 2009 33
1 2010 67
1 2011 38
2 2009 45
3 2009 65
3 2010 74
4 2009 47
4 2010 51
4 2011 14
I need to select only the rows whose Id appears in all three years: 2009, 2010 and 2011.
Id Year V1
1 2009 33
1 2010 67
1 2011 38
4 2009 47
4 2010 51
4 2011 14
I tried:
d1_3 <- subset(d1, Year==2009 |Year==2010 |Year==2011 )
But it doesn't work.
Can anyone suggest how I can do this in R?
Recommended answer
I think ave could be useful here. I call your original data frame 'df'. For each Id, check whether each of the years 2009-2011 is present in Year (2009:2011 %in% x). This gives a logical vector, which can be summed. Test whether the sum equals 3 (if all three years are present, the sum is 3). This results in a new logical vector, which is used to subset the rows of the data frame.
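# ave() returns its result in the type of df$Year (0/1 here), so convert it to logical before indexing
df[as.logical(ave(df$Year, df$Id, FUN = function(x) sum(2009:2011 %in% x) == 3)), ]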
# Id Year V1
# 1 1 2009 33
# 2 1 2010 67
# 3 1 2011 38
# 7 4 2009 47
# 8 4 2010 51
# 9 4 2011 14
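The same idea can be split into separate steps, which may be easier to follow. This is only a sketch: it rebuilds the data from the question, and it swaps in tapply with all() in place of ave with sum(...) == 3. The names years, has_all, keep_ids and result are illustrative and do not come from the original answer.

```r
# Rebuild the example data from the question (the name df follows the answer).
df <- data.frame(
  Id   = c(1, 1, 1, 2, 3, 3, 4, 4, 4),
  Year = c(2009, 2010, 2011, 2009, 2009, 2010, 2009, 2010, 2011),
  V1   = c(33, 67, 38, 45, 65, 74, 47, 51, 14)
)

# Hypothetical helper name: the years every Id must cover.
years <- 2009:2011

# The attempt in the question filters rows one at a time. Every row here
# already falls in 2009-2011, so that filter keeps all rows.
nrow(subset(df, Year %in% years)) == nrow(df)

# Step 1: per Id, test whether every target year occurs among its Year values.
has_all <- tapply(df$Year, df$Id, function(x) all(years %in% x))

# Step 2: collect the Ids that pass (tapply names its result by group).
keep_ids <- names(has_all)[has_all]

# Step 3: keep only the rows that belong to those Ids.
result <- df[as.character(df$Id) %in% keep_ids, ]
print(result)
```

The condition has to be evaluated per Id group rather than per row, which is why a plain row filter on Year cannot express it. Using all() instead of comparing a sum to 3 avoids hard-coding the number of years.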