grep using a character vector with multiple patterns
Question
I am trying to use grep to test whether the strings in one vector are present in another vector, and to output the values that are present (the matching patterns).
I have a data frame like this:
FirstName Letter
Alex A1
Alex A6
Alex A7
Bob A1
Chris A9
Chris A6
I have a vector of string patterns to look for in the "Letter" column, for example: c("A1", "A9", "A6").
I would like to check whether any of the strings in the pattern vector are present in the "Letter" column. If they are, I would like the unique matching values as output.
The problem is, I don't know how to use grep with multiple patterns. I tried:
matches <- unique (
grep("A1| A9 | A6", myfile$Letter, value=TRUE, fixed=TRUE)
)
But it gives me 0 matches, which is not correct. Any suggestions?
Recommended answer
In addition to @Marek's comment about not including fixed=TRUE, you also need to remove the spaces from your regular expression. With fixed=TRUE the pattern is treated as a literal string, so "|" no longer means "or", and in regex mode any spaces inside the pattern have to be matched literally too. It should be "A1|A9|A6".
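To see why the original call returns nothing, here is a small sketch. The vector letters_col is a hypothetical stand-in for myfile$Letter, rebuilt from the table in the question; the variable names are only for illustration.

```r
# Hypothetical stand-in for myfile$Letter, copied from the question's table
letters_col <- c("A1", "A6", "A7", "A1", "A9", "A6")

# fixed = TRUE: the whole pattern is one literal string, "|" is not alternation
fixed_hits <- grep("A1|A9|A6", letters_col, value = TRUE, fixed = TRUE)

# Regex mode, but the alternatives are "A1", " A9 " and " A6" (spaces included)
spaced_hits <- grep("A1| A9 | A6", letters_col, value = TRUE)

# Regex mode without the spaces: all three alternatives can match
clean_hits <- unique(grep("A1|A9|A6", letters_col, value = TRUE))

print(fixed_hits)
print(spaced_hits)
print(clean_hits)
```

Only the last call finds all three values; the first finds none and the second finds just the "A1" entries.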
You also mention that there are lots of patterns. Assuming that they are in a vector:
toMatch <- c("A1", "A9", "A6")
Then you can create your regular expression directly using paste and collapse = "|".
matches <- unique(grep(paste(toMatch, collapse = "|"),
                       myfile$Letter, value = TRUE))
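For a self-contained run, the sketch below rebuilds the data frame from the table in the question (the name myfile follows the question; stringsAsFactors = FALSE is an assumption so that Letter stays a character column). The anchored variant at the end is an optional addition, not part of the original answer: grep matches substrings, so an unanchored "A1" would also match a value such as "A10".

```r
# Data frame rebuilt from the question's table (assumed to be character columns)
myfile <- data.frame(
  FirstName = c("Alex", "Alex", "Alex", "Bob", "Chris", "Chris"),
  Letter    = c("A1", "A6", "A7", "A1", "A9", "A6"),
  stringsAsFactors = FALSE
)
toMatch <- c("A1", "A9", "A6")

# Join the patterns into a single alternation: "A1|A9|A6"
pattern <- paste(toMatch, collapse = "|")
matches <- unique(grep(pattern, myfile$Letter, value = TRUE))

# Optional (assumption: whole-value matches are wanted): anchor the alternation
anchored <- paste0("^(", pattern, ")$")
exact <- unique(grep(anchored, myfile$Letter, value = TRUE))

print(matches)
print(exact)
```

On this data both calls return the same values. If the patterns are plain strings rather than regular expressions and only exact matches matter, intersect(toMatch, myfile$Letter) may be a simpler alternative.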