R regex to find two words in the same string; order and distance may vary
Problem description
I want to create a single regex (if possible) that searches through strings and determines whether two words occur in the same string. I know I can use two grepl statements (as shown below), but I want a single regex to test for this condition. The more efficient the regex, the better.
I want to find strings that contain both "man" and "dog", matched case-insensitively.
x <- c(
"The dog and the man play in the park.",
"The man plays with the dog.",
"That is the man's hat.",
"Man I love that dog!",
"I'm dog tired"
)
## this works but I want a single regex
grepl("dog", x, ignore.case=TRUE) & grepl("man", x, ignore.case=TRUE)
Recommended answer
Use the regex alternation operator | to cover both orders of the two words.
grepl(".*(dog.*man|man.*dog).*", x, ignore.case=TRUE)
Use word boundaries if you need to match whole words only.
grepl(".*(\\bdog\\b.*\\bman\\b|\\bman\\b.*\\bdog\\b).*", x, ignore.case=TRUE)
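To see what the word boundaries change, here is a minimal sketch. The test strings in `y` are hypothetical and not part of the original question; they are chosen so that "dog" and "man" appear only inside longer words in the first string.

```r
# Hypothetical test strings (an assumption, not from the original question)
y <- c("The dogma of mankind", "A man and his dog")

# Pattern without word boundaries: substrings inside longer words also match
loose <- "(dog.*man|man.*dog)"

# Pattern with word boundaries: only the whole words "dog" and "man" match
strict <- "(\\bdog\\b.*\\bman\\b|\\bman\\b.*\\bdog\\b)"

loose_hits  <- grepl(loose, y, ignore.case = TRUE)
strict_hits <- grepl(strict, y, ignore.case = TRUE)

print(loose_hits)   # the first string matches via "dogma" and "mankind"
print(strict_hits)  # the first string no longer matches
```

Whether the stricter pattern is the right choice depends on whether matches such as "mankind" should count.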
The leading and trailing .* are not needed, because grepl looks for a match anywhere in the string.
grepl("(dog.*man|man.*dog)", x, ignore.case=TRUE)
You can also put the case-insensitive modifier inside the regex itself.
grepl("(?i)(dog.*man|man.*dog)", x)
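As a quick sanity check, the single-regex result can be compared against the two-grepl baseline from the question. The sketch below also includes a lookahead variant; that variant is an alternative technique and an assumption on my part, not something the answer proposes, and it needs perl = TRUE because lookaheads are a PCRE feature.

```r
x <- c(
  "The dog and the man play in the park.",
  "The man plays with the dog.",
  "That is the man's hat.",
  "Man I love that dog!",
  "I'm dog tired"
)

# Baseline from the question: two separate grepl calls combined with &
baseline <- grepl("dog", x, ignore.case = TRUE) & grepl("man", x, ignore.case = TRUE)

# Single regex from the answer: alternation covering both word orders
alternation <- grepl("(dog.*man|man.*dog)", x, ignore.case = TRUE)

# Alternative (not from the answer): one lookahead per word, anchored at the start
lookahead <- grepl("^(?=.*dog)(?=.*man)", x, ignore.case = TRUE, perl = TRUE)

print(identical(baseline, alternation))
print(identical(baseline, lookahead))
```

The lookahead form may be easier to extend when more than two words must all be present, since each extra word adds one lookahead instead of multiplying the number of orderings in the alternation.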