Regex: match a substring unless another substring matches


Question

I'm trying to dig deeper into regexes and want to match a condition unless some other substring is also found in the same string. I know I can use two grepl statements (as seen below), but I want a single regex that tests for this condition, as I'm pushing my understanding. Let's say I want to match strings containing both "dog" and "man" using "(dog.*man|man.*dog)" (taken from here), but not if the string also contains the substring "park". I figured I could use (*SKIP)(*FAIL) to negate the "park", but this does not cause the string to fail (shown below).

  • How can I express the logic 'find "dog" & "man" but not "park"' with a single regex?
  • What is wrong with my understanding of (*SKIP)(*FAIL)|?

Code:

x <- c(
    "The dog and the man play in the park.",
    "The man plays with the dog.",
    "That is the man's hat.",
    "Man I love that dog!",
    "I'm dog tired",
    "The dog park is no place for man.",
    "Park next to this dog's man."
)

# Could do this but want one regex
grepl("(dog.*man|man.*dog)", x, ignore.case=TRUE) & !grepl("park", x, ignore.case=TRUE)

# Thought this would work, it does not
grepl("park(*SKIP)(*FAIL)|(dog.*man|man.*dog)", x, ignore.case=TRUE, perl=TRUE)

Recommended answer

You can use an anchored look-ahead solution (this requires Perl-style regular expressions, i.e. perl=TRUE):

grepl("^(?!.*park)(?=.*dog.*man|.*man.*dog)", x, ignore.case=TRUE, perl=TRUE)

Here is an IDEONE demo

  • ^ - anchors the pattern at the start of the string
  • (?!.*park) - fails the match if "park" occurs anywhere in the string
  • (?=.*dog.*man|.*man.*dog) - fails the match unless both "man" and "dog" occur, in either order
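
As a quick check, the sketch below runs the two-grepl baseline from the question, the anchored look-ahead pattern, and the (*SKIP)(*FAIL) attempt side by side on the same sample vector. The variable names baseline, lookahead and skip_fail are introduced here for illustration only. The comparison also hints at why the (*SKIP)(*FAIL) attempt does not reject the string: it only discards a match attempt that runs through "park", after which the engine carries on and can still find "dog ... man" elsewhere in the string.

```r
x <- c(
    "The dog and the man play in the park.",
    "The man plays with the dog.",
    "That is the man's hat.",
    "Man I love that dog!",
    "I'm dog tired",
    "The dog park is no place for man.",
    "Park next to this dog's man."
)

# Baseline from the question: two separate grepl() calls
baseline <- grepl("(dog.*man|man.*dog)", x, ignore.case = TRUE) &
    !grepl("park", x, ignore.case = TRUE)

# Single anchored look-ahead pattern from the answer
lookahead <- grepl("^(?!.*park)(?=.*dog.*man|.*man.*dog)", x,
                   ignore.case = TRUE, perl = TRUE)

# The (*SKIP)(*FAIL) attempt from the question, kept for comparison
skip_fail <- grepl("park(*SKIP)(*FAIL)|(dog.*man|man.*dog)", x,
                   ignore.case = TRUE, perl = TRUE)

# Show the three results next to each other
print(data.frame(baseline, lookahead, skip_fail))

# TRUE when the single regex reproduces the two-step logic
print(identical(baseline, lookahead))
```

The look-ahead works as a whole-string test because every assertion is evaluated from the anchored start position, so "park" is rejected no matter where it sits relative to "dog" and "man".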

Another version (more scalable) with 3 look-aheads:

^(?!.*park)(?=.*dog)(?=.*man)
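
One way to see why this form scales: each required or forbidden word gets its own look-ahead, so the pattern can be assembled from word lists. The helper build_pattern below is hypothetical (it is not part of the original answer) and assumes the words contain no regex metacharacters; it is a minimal sketch rather than a general-purpose implementation.

```r
# Hypothetical helper, not from the original answer: builds an anchored
# pattern with one negative look-ahead per forbidden word and one positive
# look-ahead per required word.
# Assumption: the words are plain literals with no regex metacharacters.
build_pattern <- function(required, forbidden = character(0)) {
    neg <- if (length(forbidden) > 0) {
        paste0("(?!.*", forbidden, ")", collapse = "")
    } else {
        ""
    }
    pos <- if (length(required) > 0) {
        paste0("(?=.*", required, ")", collapse = "")
    } else {
        ""
    }
    paste0("^", neg, pos)
}

pattern <- build_pattern(required = c("dog", "man"), forbidden = "park")
print(pattern)

x <- c(
    "The dog and the man play in the park.",
    "The man plays with the dog.",
    "That is the man's hat.",
    "Man I love that dog!",
    "I'm dog tired",
    "The dog park is no place for man.",
    "Park next to this dog's man."
)

# Look-aheads need the PCRE engine, hence perl = TRUE
result <- grepl(pattern, x, ignore.case = TRUE, perl = TRUE)
print(result)
```

Adding another condition then means adding a word to one of the two vectors instead of enumerating every ordering inside a single alternation.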
