Filter multiple conditions dplyr
Question
I have a data.frame with character data in one of the columns. I would like to filter on multiple values from that same column. Is there an easy way to do this that I'm missing?
Example (data.frame named dat):
days name
88 Lynn
11 Tom
2 Chris
5 Lisa
22 Kyla
1 Tom
222 Lynn
2 Lynn
I'd like to filter for Tom and Lynn, for example (that is, keep only their rows).
When I do:
target <- c("Tom", "Lynn")
filt <- filter(dat, name == target)
I get this error:
longer object length is not a multiple of shorter object length
Recommended answer
You need %in% instead of ==:
library(dplyr)
target <- c("Tom", "Lynn")
filter(dat, name %in% target) # equivalently, dat %>% filter(name %in% target)
produces
days name
1 88 Lynn
2 11 Tom
3 1 Tom
4 222 Lynn
5 2 Lynn
To understand why, consider what happens here:
dat$name == target
# [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE
Basically, we're recycling the length-two target vector four times to match the length of dat$name. In other words, we are doing:
Lynn == Tom
Tom == Lynn
Chris == Tom
Lisa == Lynn
... continue repeating Tom and Lynn until end of data frame
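The recycling can be made explicit with rep(). The following is a minimal sketch, assuming the sample data is rebuilt by hand as a data.frame; the names recycled and comparison are introduced here for illustration only.

```r
# Sample data reconstructed from the question
dat <- data.frame(
  days = c(88, 11, 2, 5, 22, 1, 222, 2),
  name = c("Lynn", "Tom", "Chris", "Lisa", "Kyla", "Tom", "Lynn", "Lynn"),
  stringsAsFactors = FALSE
)
target <- c("Tom", "Lynn")

# What == implicitly compares against: target repeated to nrow(dat) elements
recycled <- rep(target, length.out = nrow(dat))

# Side-by-side view of each element-wise comparison
comparison <- data.frame(
  name = dat$name,
  compared_to = recycled,
  equal = dat$name == recycled
)
print(comparison)

# The explicit version matches the implicitly recycled comparison
print(identical(dat$name == target, comparison$equal))
```

Only the last row has the same value in both the name and compared_to columns, which is where the single TRUE comes from.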
In this case we don't get an error. I suspect your actual data frame has a number of rows that doesn't allow clean recycling, whereas the sample you provided does (8 rows). If the sample had an odd number of rows, I would have gotten the same error as you. But even when recycling works, this is clearly not what you want. Basically, the statement dat$name == target is equivalent to saying:
return TRUE for every odd-positioned value that is equal to "Tom" or every even-positioned value that is equal to "Lynn".
It so happens that the last value in your sample data frame sits at an even position and is equal to "Lynn", hence the one TRUE above.
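The length mismatch can be reproduced with an odd number of values. This is a sketch under an assumption: name7 is a hypothetical 7-element variant of the name column (the sample's last row dropped), not the asker's real data. In base R the condition is raised as a warning, which the sketch captures so the message can be inspected.

```r
target <- c("Tom", "Lynn")

# Hypothetical 7-element variant of the name column (assumption for illustration)
name7 <- c("Lynn", "Tom", "Chris", "Lisa", "Kyla", "Tom", "Lynn")

# Capture the warning message instead of letting it print to the console
msg <- tryCatch(
  {
    name7 == target
    "no warning"
  },
  warning = function(w) conditionMessage(w)
)
print(msg)
```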
In contrast, dat$name %in% target says:
for each value in dat$name, check whether it exists in target.
Very different. Here is the result:
[1] TRUE TRUE FALSE FALSE FALSE TRUE TRUE TRUE
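One way to check the membership test is to compare it with an explicit OR of two equality tests and with match(), which %in% is built on. A minimal sketch, again assuming the sample data is rebuilt by hand; in_target, either and via_match are illustrative names.

```r
# Sample data reconstructed from the question
dat <- data.frame(
  days = c(88, 11, 2, 5, 22, 1, 222, 2),
  name = c("Lynn", "Tom", "Chris", "Lisa", "Kyla", "Tom", "Lynn", "Lynn"),
  stringsAsFactors = FALSE
)
target <- c("Tom", "Lynn")

# Membership test: one TRUE/FALSE per row, independent of row position
in_target <- dat$name %in% target

# Spelled out as an OR of two scalar comparisons (no vector recycling problem,
# because each right-hand side has length one)
either <- dat$name == "Tom" | dat$name == "Lynn"

# Spelled out via match(): NA means "not found in target"
via_match <- !is.na(match(dat$name, target))

print(in_target)
print(identical(in_target, either))
print(identical(in_target, via_match))
```

The OR form works for two values but grows with every extra name, whereas %in% takes a target vector of any length.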
Note that your problem has nothing to do with dplyr; it is just a misuse of ==.
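Since the issue is independent of dplyr, the same filtering can be sketched with base R subsetting alone. The data is rebuilt by hand, and kept and dropped are illustrative names; the negated form is included in case "filter out" was meant as excluding those names.

```r
# Sample data reconstructed from the question
dat <- data.frame(
  days = c(88, 11, 2, 5, 22, 1, 222, 2),
  name = c("Lynn", "Tom", "Chris", "Lisa", "Kyla", "Tom", "Lynn", "Lynn"),
  stringsAsFactors = FALSE
)
target <- c("Tom", "Lynn")

# Keep only rows whose name is in target (base R, no dplyr needed)
kept <- dat[dat$name %in% target, ]
print(kept)

# Negate the test to drop those rows instead
dropped <- dat[!(dat$name %in% target), ]
print(dropped)
```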