Keeping rows if any column matches one of a set of values
Question
I have a simple question about subsetting in R; I think I am close but can't quite get it. Basically, I have 25 columns of interest and about 100 values. I want to keep any row that has ANY of those values in at least one of those columns. Simple example:
Values <- c(1,2,5)
col1 <- c(2,6,8,1,3,5)
col2 <- c(1,4,5,9,0,0)
col3 <- c('dog', 'cat', 'cat', 'pig', 'chicken', 'cat')
df <- cbind.data.frame(col1, col2, col3)
df1 <- subset(df, col1%in%Values)
(Note that the third column is there to indicate that there are additional columns, but I don't need to match the values against those; the rows retained depend only on columns 1 and 2.) I know that in this trivial case I could just add
| col2%in%Values
to get the additional rows from column 2, but with 25 columns I don't want to add an OR statement for every single one. I tried
file2011_test <- file2011[file2011[,9:33]%in%CO_codes] #real names of values
but it didn't work. (And yes, I know this mixes subsetting styles; I find subset() easier to understand, but I don't think it can do what I need?)
Recommended answer
Maybe you can try:
df[Reduce(`|`, lapply(as.data.frame(df), function(x) x %in% Values)),]
#     col1 col2
#[1,]    2    1
#[2,]    8    5
#[3,]    1    9
#[4,]    5    0
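Or, if df is a numeric matrix (which the [1,]-style row labels in the output suggest; on a data.frame, %in% compares whole columns rather than individual cells, so the dim<- step fails):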
indx <- df %in% Values
dim(indx) <- dim(df)
df[!!rowSums(indx),]
#     col1 col2
#[1,]    2    1
#[2,]    8    5
#[3,]    1    9
#[4,]    5    0
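The rowSums idea can also be adapted to a data.frame by building the logical matrix one column at a time. The following is only a minimal sketch, assuming the example data from the question; the name `cols` is introduced here for illustration and is not part of the original answer.

```r
Values <- c(1, 2, 5)
df <- data.frame(
  col1 = c(2, 6, 8, 1, 3, 5),
  col2 = c(1, 4, 5, 9, 0, 0),
  col3 = c("dog", "cat", "cat", "pig", "chicken", "cat")
)

# Hypothetical name: the columns that should be checked against Values
cols <- c("col1", "col2")

# sapply returns a logical matrix with one column per checked column
indx <- sapply(df[cols], function(x) x %in% Values)

# Keep the rows where at least one checked column matched
res <- df[rowSums(indx) > 0, ]
```

Restricting the check to `cols` keeps the other columns in the result without matching against them.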
Update
Using the new dataset:
df[Reduce(`|`, lapply(df[sapply(df, is.numeric)], function(x) x %in% Values)),]
#  col1 col2 col3
#1    2    1  dog
#3    8    5  cat
#4    1    9  pig
#6    5    0  cat
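For the situation in the question, where only a range of columns (9:33 of file2011) should be checked against CO_codes, the same Reduce pattern can be applied to a positional column selection. This is a sketch under assumptions: the small data frame, the values in CO_codes, and the name `code_cols` are all made up for illustration, with `code_cols <- 3:5` standing in for 9:33.

```r
# Hypothetical stand-in for file2011: two id columns plus three code columns
file2011 <- data.frame(
  id    = 1:4,
  name  = c("a", "b", "c", "d"),
  code1 = c(10, 20, 30, 40),
  code2 = c(50, 60, 70, 80),
  code3 = c(90, 10, 99, 98)
)
CO_codes  <- c(10, 70)  # hypothetical values to match
code_cols <- 3:5        # stands in for 9:33 in the question

# One logical vector per checked column, combined with element-wise OR
keep <- Reduce(`|`, lapply(file2011[code_cols], function(x) x %in% CO_codes))

# Note the comma: index rows with the logical vector, keep all columns
file2011_test <- file2011[keep, ]
```

Two things differ from the attempt in the question: %in% is applied to each column separately, and the result is used as a row index (with a trailing comma) rather than as a column index.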