R regex to find two words in the same string when order and distance may vary

Problem description

I want to create a single regex (if possible) that searches through strings and determines whether two words occur in the same string. I know I can combine two grepl calls (as shown below), but I would like a single regex that tests this condition. The more efficient the regex, the better.

I want to find strings that contain both "man" and "dog", ignoring case.

x <- c(
    "The dog and the man play in the park.",
    "The man plays with the dog.",
    "That is the man's hat.",
    "Man I love that dog!",
    "I'm dog tired"
)

## this works but I want a single regex
grepl("dog", x, ignore.case=TRUE)  & grepl("man", x, ignore.case=TRUE) 

Recommended answer

Use the regex alternation operator |, with one branch for each order in which the two words can appear.

grepl(".*(dog.*man|man.*dog).*", x, ignore.case=TRUE)
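One way to check that the single pattern behaves like the two-call version from the question is to run both on the sample vector and compare the results. This is only a sanity check on the example data; the names `two_calls` and `one_call` are chosen here for illustration.

```r
x <- c(
    "The dog and the man play in the park.",
    "The man plays with the dog.",
    "That is the man's hat.",
    "Man I love that dog!",
    "I'm dog tired"
)

# Baseline from the question: two separate grepl calls combined with &
two_calls <- grepl("dog", x, ignore.case = TRUE) & grepl("man", x, ignore.case = TRUE)

# Single pattern using alternation, one branch per word order
one_call <- grepl(".*(dog.*man|man.*dog).*", x, ignore.case = TRUE)

# Both approaches should flag the same elements of x
identical(two_calls, one_call)

# The strings that contain both words
x[one_call]
```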

Use word boundaries if necessary, so that "man" and "dog" match only as whole words.

grepl(".*(\\bdog\\b.*\\bman\\b|\\bman\\b.*\\bdog\\b).*", x, ignore.case=TRUE)
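To see what the word boundaries change, it can help to try both patterns on strings where the words appear only inside longer words. The vector `y` below is made up for this illustration and is not part of the original question.

```r
# Hypothetical test strings: the first contains "man" and "dog" only as
# parts of longer words ("woman", "hotdog")
y <- c(
    "The woman bought a hotdog.",
    "The man walks the dog."
)

# Without word boundaries, substrings inside longer words also match
loose <- grepl("(dog.*man|man.*dog)", y, ignore.case = TRUE)

# With \\b on both sides, only whole words match
strict <- grepl("(\\bdog\\b.*\\bman\\b|\\bman\\b.*\\bdog\\b)", y, ignore.case = TRUE)

loose   # TRUE TRUE
strict  # FALSE TRUE
```

Note that "man's" in the sample data still counts as a whole-word match for man, because the apostrophe is not a word character.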

The leading and trailing .* are not needed, because grepl reports a match anywhere in the string:

grepl("(dog.*man|man.*dog)", x, ignore.case=TRUE)

You can also give the case-insensitive modifier within the regex itself.

grepl("(?i)(dog.*man|man.*dog)", x)
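The inline `(?i)` flag is Perl-style syntax, so passing `perl = TRUE` makes the choice of regex engine explicit. With that engine, lookaheads become available as well. The lookahead variant below is not part of the original answer; it is a commonly used alternative, sketched here on the assumption that avoiding one alternation branch per word order is useful once more than two words are involved.

```r
x <- c(
    "The dog and the man play in the park.",
    "The man plays with the dog.",
    "That is the man's hat.",
    "Man I love that dog!",
    "I'm dog tired"
)

# Inline case-insensitive flag, with the PCRE engine requested explicitly
inline_flag <- grepl("(?i)(dog.*man|man.*dog)", x, perl = TRUE)

# Alternative sketch: one lookahead per required word, anchored at the start.
# Each (?=.*word) asserts that the word occurs somewhere in the string,
# so the order of the words does not matter.
lookahead <- grepl("^(?=.*dog)(?=.*man)", x, ignore.case = TRUE, perl = TRUE)

inline_flag  # TRUE TRUE FALSE TRUE FALSE
lookahead    # TRUE TRUE FALSE TRUE FALSE
```

Adding a third required word to the lookahead form means appending one more `(?=.*word)`, whereas the alternation form would need a branch for every ordering.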
