grep using a character vector with multiple patterns

Question

I am trying to use grep to test whether any of a vector of strings are present in another vector, and to output the values that are present (the matching patterns).

I have a data frame like this:

FirstName Letter   
Alex      A1
Alex      A6
Alex      A7
Bob       A1
Chris     A9
Chris     A6

I have a vector of string patterns to be found in the "Letter" column, for example: c("A1", "A9", "A6").

I would like to check whether any of the strings in the pattern vector are present in the "Letter" column. If they are, I would like the unique matching values as output.

The problem is, I don't know how to use grep with multiple patterns. I tried:

matches <- unique (
    grep("A1| A9 | A6", myfile$Letter, value=TRUE, fixed=TRUE)
)

But it gives me 0 matches, which is not correct. Any suggestions?

Recommended answer

In addition to @Marek's comment about not including fixed=TRUE (with it, the pattern is treated as a literal string, so | no longer means "or"), you also need to remove the spaces from your regular expression, because spaces in a pattern are matched literally. It should be "A1|A9|A6".
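To make the two problems concrete, here is a minimal sketch. The vector letters_col is a hypothetical stand-in for myfile$Letter, filled with the values shown in the question.

```r
# Hypothetical sample data mirroring the "Letter" column from the question
letters_col <- c("A1", "A6", "A7", "A1", "A9", "A6")

# fixed = TRUE looks for the literal text "A1| A9 | A6", which never occurs
literal_hits <- grep("A1| A9 | A6", letters_col, value = TRUE, fixed = TRUE)

# As a regex the spaces are still literal, so only the "A1" alternative can match
spaced_hits <- grep("A1| A9 | A6", letters_col, value = TRUE)

# With the spaces removed, every alternative can match
clean_hits <- grep("A1|A9|A6", letters_col, value = TRUE)

print(literal_hits)
print(spaced_hits)
print(clean_hits)
```

Comparing the three results shows that dropping fixed=TRUE alone is not enough: the spaces have to go as well.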

You also mention that there are lots of patterns. Assuming that they are in a vector:

toMatch <- c("A1", "A9", "A6")

Then you can create your regular expression directly using paste and collapse = "|".

matches <- unique (grep(paste(toMatch,collapse="|"), 
                        myfile$Letter, value=TRUE))
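For reference, here is a self-contained sketch of the whole approach. The data frame is rebuilt by hand from the table in the question, so its construction is an assumption about how myfile was loaded. The %in% line at the end is an alternative that is not part of the original answer: a regex such as "A1" would also match a longer value like "A10", so exact comparison may be the safer choice when the patterns are complete values and not substrings.

```r
# Rebuild the example data by hand (assumed to match the asker's myfile)
myfile <- data.frame(
  FirstName = c("Alex", "Alex", "Alex", "Bob", "Chris", "Chris"),
  Letter    = c("A1", "A6", "A7", "A1", "A9", "A6"),
  stringsAsFactors = FALSE
)

toMatch <- c("A1", "A9", "A6")

# Join the patterns into one regular expression: "A1|A9|A6"
pattern <- paste(toMatch, collapse = "|")

# Unique values of Letter that match any of the patterns
matches <- unique(grep(pattern, myfile$Letter, value = TRUE))

# Alternative (not from the original answer): exact, whole-value matching
exact_matches <- unique(myfile$Letter[myfile$Letter %in% toMatch])

print(matches)
print(exact_matches)
```

Both approaches return the same values for this data. They differ only when one pattern is a substring of a longer value in the column.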
