Subset multiple columns in R - more elegant code?
Question
I am subsetting a dataframe according to multiple criteria across several columns. I am choosing the rows in the dataframe that contain any one of several values defined in the vector "criteria" in any one of three different columns.
I have some code that works, but wonder what other (more elegant?) ways there are to do this. Here is what I've done:
criteria <- 1:10
subset1 <- subset(data, data[, "Col1"] %in% criteria |
                        data[, "Col2"] %in% criteria |
                        data[, "Col3"] %in% criteria)
Suggestions warmly welcomed. (I am an R beginner, so very simple explanations about what you are suggesting are also warmly welcomed.)
Recommended answer
I'm not sure you need two apply calls here:
# Data
df <- data.frame(x = 1:4, Col1 = c(11, 12, 3, 13), Col2 = c(9, 12, 10, 13), Col3 = c(9, 13, 42, 23))
criteria <- 1:10
# Solution
df[apply(df[c("Col1", "Col2", "Col3")], 1, function(x) any(x %in% criteria)), ]
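Since a simple explanation was asked for, here is the same one-liner unpacked into three steps. This is only a sketch of how the call can be read; the intermediate names checked, keep and result are made up for illustration and are not part of the original answer.

```r
df <- data.frame(x = 1:4,
                 Col1 = c(11, 12, 3, 13),
                 Col2 = c(9, 12, 10, 13),
                 Col3 = c(9, 13, 42, 23))
criteria <- 1:10

# Step 1: keep only the columns that should be checked
checked <- df[c("Col1", "Col2", "Col3")]

# Step 2: go through the rows (MARGIN = 1) and ask, for each row,
# whether any of its values appears in criteria
keep <- apply(checked, 1, function(row) any(row %in% criteria))

# Step 3: use the TRUE/FALSE vector to pick rows from the full data frame
result <- df[keep, ]
```

With this example data, keep is TRUE for rows 1 and 3 only, so result holds those two rows. Note that apply converts the selected columns to a matrix first, so this is safest when they all have the same type (all numeric here).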
Unless you need to check a lot of columns, though, it is probably more readable to spell the conditions out:
subset(df, Col1 %in% criteria | Col2 %in% criteria | Col3 %in% criteria)
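If the list of columns does grow, one possible middle ground is to build one TRUE/FALSE vector per column and combine them with OR. This is a sketch rather than something from the original answer; the names cols, hits, keep and result are hypothetical.

```r
df <- data.frame(x = 1:4,
                 Col1 = c(11, 12, 3, 13),
                 Col2 = c(9, 12, 10, 13),
                 Col3 = c(9, 13, 42, 23))
criteria <- 1:10
cols <- c("Col1", "Col2", "Col3")  # hypothetical: the columns to check

# One logical vector per column: is each value in criteria?
hits <- lapply(df[cols], function(col) col %in% criteria)

# Combine the vectors element by element with OR,
# which mirrors writing Col1 %in% ... | Col2 %in% ... | Col3 %in% ...
keep <- Reduce(`|`, hits)

result <- df[keep, ]
```

Adding another column then only means extending cols, and each column is tested separately, so no conversion to a matrix is involved.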