grepl in R to find matches to any of a list of character strings

Question

Is it possible to use a grepl argument when referring to a list of values, maybe using the %in% operator? I want to take the data below and if the animal name has "dog" or "cat" in it, I want to return a certain value, say, "keep"; if it doesn't have "dog" or "cat", I want to return "discard".

data <- data.frame(animal = sample(c("cat","dog","bird", 'doggy','kittycat'), 50, replace = T))

Now, if I were just to do this by strictly matching values, say, "cat" and "dog", I could use the following approach:

matches <- c("cat","dog")

data$keep <- ifelse(data$animal %in% matches, "Keep", "Discard")
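
To make the limitation of exact matching concrete, here is a minimal sketch on a small hypothetical vector (`animals` is an illustrative name, not part of the data above). `%in%` compares whole values, so "doggy" and "kittycat" are not matched:

```r
# Hypothetical five-element vector covering every value in the sample
animals <- c("cat", "dog", "bird", "doggy", "kittycat")
matches <- c("cat", "dog")

# %in% tests for exact equality with any element of `matches`
exact <- animals %in% matches
print(exact)

# Only the exact values "cat" and "dog" are labelled "Keep"
labels <- ifelse(exact, "Keep", "Discard")
print(labels)
```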

But grep or grepl uses only the first element of the pattern vector:

data$keep <- ifelse(grepl(matches, data$animal), "Keep","Discard")

returns

Warning message:
In grepl(matches, data$animal) :
  argument 'pattern' has length > 1 and only the first element will be used
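
As a rough illustration of what the warning implies, the sketch below passes only the first element of `matches` explicitly, which is what the message says grepl falls back to. The vector `animals` is a hypothetical stand-in for `data$animal`:

```r
# Hypothetical stand-in for data$animal
animals <- c("cat", "dog", "bird", "doggy", "kittycat")
matches <- c("cat", "dog")

# Using only the first pattern, "cat": every "dog" entry is missed
first_only <- grepl(matches[1], animals)
print(first_only)
```

Under this assumption, "dog" and "doggy" would be labelled "Discard" even though they should be kept.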

Note, I saw this thread in my search, but this doesn't appear to work: grep using a character vector with multiple patterns

Solution

You can use an "or" (|) statement inside the regular expression of grepl.

ifelse(grepl("dog|cat", data$animal), "keep", "discard")
# [1] "keep"    "keep"    "discard" "keep"    "keep"    "keep"    "keep"    "discard"
# [9] "keep"    "keep"    "keep"    "keep"    "keep"    "keep"    "discard" "keep"   
#[17] "discard" "keep"    "keep"    "discard" "keep"    "keep"    "discard" "keep"   
#[25] "keep"    "keep"    "keep"    "keep"    "keep"    "keep"    "keep"    "keep"   
#[33] "keep"    "discard" "keep"    "discard" "keep"    "discard" "keep"    "keep"   
#[41] "keep"    "keep"    "keep"    "keep"    "keep"    "keep"    "keep"    "keep"   
#[49] "keep"    "discard"

The regular expression dog|cat tells the regular expression engine to look for either "dog" or "cat", so grepl returns TRUE for any element that contains at least one of them.
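
If the values to match already live in a vector, as in the question, one possible approach is to build the alternation pattern with `paste(..., collapse = "|")` rather than typing it by hand. This is a minimal sketch on a hypothetical vector `animals`. The second variant, which combines one `fixed = TRUE` call per entry, is an assumption about what you might want when the entries contain regex metacharacters:

```r
# Hypothetical inputs mirroring the question
matches <- c("cat", "dog")
animals <- c("cat", "dog", "bird", "doggy", "kittycat")

# Collapse the vector into a single alternation pattern: "cat|dog"
pattern <- paste(matches, collapse = "|")
keep <- ifelse(grepl(pattern, animals), "keep", "discard")
print(keep)

# Alternative: treat each entry as a literal string (no regex
# interpretation) and OR the per-pattern results together
hits <- Reduce(`|`, lapply(matches, grepl, x = animals, fixed = TRUE))
print(hits)
```

With the data frame from the question, the same idea would be `data$keep <- ifelse(grepl(paste(matches, collapse = "|"), data$animal), "Keep", "Discard")`.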
