Regular expressions (RegEx) and dplyr::filter()
Question
I have a simple data frame that looks like this:
x <- c("aa", "aa", "aa", "bb", "cc", "cc", "cc")
y <- c(101, 102, 113, 201, 202, 344, 407)
df = data.frame(x, y)
x y
1 aa 101
2 aa 102
3 aa 113
4 bb 201
5 cc 202
6 cc 344
7 cc 407
I would like to use dplyr::filter() and a RegEx to filter out all the y observations that start with the number 1.
I'm imagining that the code will look something like this:
df %>%
filter(y != grep("^1"))
But I am getting this error: Error in grep("^1") : argument "x" is missing, with no default
Recommended answer
You need to double-check the documentation for grepl and filter.
For grep/grepl you also have to supply the vector that you want to search (y in this case), and filter takes a logical vector (i.e. you need to use grepl). If you want to supply an index vector (from grep), you can use slice instead.
df %>% filter(!grepl("^1", y))
Or with an index derived from grep:
df %>% slice(grep("^1", y, invert = TRUE))
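To make the difference between the two return types concrete, here is a minimal base-R sketch using the data from the question. The variable names (is_match, match_idx, kept_logical, kept_index) are only illustrative, and the data frame subsetting with [ ] stands in for filter() and slice() so the snippet runs without dplyr.

```r
# Same data as in the question
x <- c("aa", "aa", "aa", "bb", "cc", "cc", "cc")
y <- c(101, 102, 113, 201, 202, 344, 407)
df <- data.frame(x, y)

# grepl() returns one TRUE/FALSE per element: the shape filter() expects.
# The numeric y is coerced to character before matching.
is_match <- grepl("^1", y)

# grep() returns the positions of the matches: the shape slice() expects
match_idx <- grep("^1", y)

# Base-R counterparts of the two dplyr calls above
kept_logical <- df[!is_match, ]
kept_index   <- df[grep("^1", y, invert = TRUE), ]

print(is_match)   # TRUE TRUE TRUE FALSE FALSE FALSE FALSE
print(match_idx)  # 1 2 3
print(kept_logical)
```

Using invert = TRUE rather than negating the index (-grep(...)) is probably the safer habit: when nothing matches, grep returns an empty vector, and a negated empty index selects no rows instead of all of them.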
But you can also just use substr, because you are only interested in the first character:
df %>% filter(substr(y, 1, 1) != 1)
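As a small sketch of what that comparison does, the snippet below spells out the implicit type conversion. substr coerces the numeric y to character, so the first character is really compared as a string. The startsWith variant is not part of the original answer; it is shown only as an assumed base-R alternative that expresses the same test, and the variable names are illustrative.

```r
y <- c(101, 102, 113, 201, 202, 344, 407)

# substr() coerces y to character and extracts the first character
first_char <- substr(y, 1, 1)

# Comparing against the string "1" makes the character comparison explicit
keep_substr <- first_char != "1"

# startsWith() requires character input, so convert explicitly
keep_starts <- !startsWith(as.character(y), "1")

print(y[keep_substr])  # 201 202 344 407
```

Either logical vector could be passed to filter() in place of the !grepl("^1", y) condition.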