Better way to filter a data frame with dplyr using OR?
Problem description
I have a data frame in R with columns subject1 and subject2 (which contain Library of Congress subject headings). I'd like to filter the data frame by testing whether the subjects match an approved list. Say, for example, that I have this data frame.
data <- data.frame(
  subject1 = c("History", "Biology", "Physics", "Digital Humanities"),
  subject2 = c("Chemistry", "Religion", "Chemistry", "Religion")
)
And suppose this is the list of approved subjects.
condition <- c("History", "Religion")
What I want to do is filter by either subject1 or subject2:
subset <- filter(data, subject1 %in% condition | subject2 %in% condition)
That returns items 1, 2, and 4 from the original data frame, as desired.
Is that the best way to filter on multiple fields using OR rather than AND logic? It seems like there must be a better, more idiomatic way, but I don't know what it is.
Maybe a more generic way to ask the question is: if I combine subject1 and subject2, is there a way to test whether any value in one vector matches any value in another vector? I'd like to write something like:
subset <- filter(data, c(subject1, subject2) %in% condition)
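For reference, here is a rough base R sketch of why the combined form above does not line up with the rows. Concatenating the two columns gives a vector twice as long as the data frame, so the result of %in% has to be folded back into one column per subject before it can be reduced to one value per row. The names combined, hit_matrix and keep are made up for this illustration, and stringsAsFactors = FALSE is set only so that c() concatenates plain strings on older R versions.

```r
data <- data.frame(
  subject1 = c("History", "Biology", "Physics", "Digital Humanities"),
  subject2 = c("Chemistry", "Religion", "Chemistry", "Religion"),
  stringsAsFactors = FALSE
)
condition <- c("History", "Religion")

# Concatenating both columns yields 8 values for a 4-row data frame
combined <- c(data$subject1, data$subject2)
hits <- combined %in% condition

# Fold the hits back into a 4 x 2 matrix (filled column by column, so
# column 1 corresponds to subject1 and column 2 to subject2)
hit_matrix <- matrix(hits, nrow = nrow(data))

# A row is kept when at least one of its subjects is approved
keep <- rowSums(hit_matrix) > 0
data[keep, ]
```

With the sample data this keeps rows 1, 2 and 4, the same rows as the explicit subject1 %in% condition | subject2 %in% condition version.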
Solution
I'm not sure whether this approach is better, but at least you don't have to write out the column names:
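library(dplyr)
# sapply() builds a logical matrix with one column per subject column;
# a row is kept when at least one of its entries is TRUE
filter(data, rowSums(sapply(data, "%in%", condition)) > 0)
#             subject1  subject2
# 1            History Chemistry
# 2            Biology  Religion
# 3 Digital Humanities  Religion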
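As a possible alternative, newer dplyr releases offer if_any(), which applies a predicate to several columns and keeps a row when any of them matches. This is only a sketch: it assumes dplyr 1.0.4 or later, it is not part of the original answer, and the result name approved is chosen just for this example.

```r
library(dplyr)

data <- data.frame(
  subject1 = c("History", "Biology", "Physics", "Digital Humanities"),
  subject2 = c("Chemistry", "Religion", "Chemistry", "Religion")
)
condition <- c("History", "Religion")

# if_any() evaluates the predicate on each selected column and
# combines the results with OR, giving one logical value per row
approved <- filter(data, if_any(c(subject1, subject2), ~ .x %in% condition))
approved
```

To test every column instead of naming them, everything() can replace c(subject1, subject2), which mirrors the sapply() approach above.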