Overlapping matches in Regex

Question

I can't seem to find an answer to this problem, and I'm wondering if one exists. Simplified example:

Consider a string "nnnn", where I want to find all matches of "nn" - but also those that overlap with each other. So the regex would provide the following 3 matches:


  1. *nn*nn
  2. n*nn*n
  3. nn*nn*

I realize this is not exactly what regexes are meant for, but walking the string and parsing this manually seems like an awful lot of code, considering that in reality the matches would have to be done using a pattern, not a literal string.

Recommended answer

A possible solution could be to use a positive lookbehind:

(?<=n)n

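It would give you the end position of each overlapping match. Below, the n checked by the lookbehind is wrapped in asterisks; the n that is actually matched (the end of the pair) is the character right after it:

  1. *n*nnn
  2. n*n*nn
  3. nn*n*n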
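As a rough illustration of the lookbehind approach, here is a minimal Python sketch. The choice of Python and its `re` module is an assumption made for this example; the original question does not name a language. It lists the index of each matched n, which is the end position of each overlapping "nn".

```python
import re

text = "nnnn"

# Each match is a single 'n' that has another 'n' immediately before it,
# so m.start() is the index of the last character of an overlapping "nn".
end_positions = [m.start() for m in re.finditer(r"(?<=n)n", text)]

print(end_positions)  # [1, 2, 3]
```

Because the lookbehind consumes no characters, the scan advances only one character per match, which is what lets the pairs overlap.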


As mentioned by Timothy Khouri, a positive lookahead is more intuitive.

Instead of his proposal, (?=nn)n, I would prefer the simpler form:

(n)(?=(n))

That would match at the first position of each string you want, and would capture the second n in group(2).
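A small Python sketch of this pattern on the "nnnn" example follows. The helper name `overlapping_pairs` is hypothetical and Python is an assumed language; the point is only to show where each match starts and what the two groups hold.

```python
import re

def overlapping_pairs(text):
    """Return (start index, matched pair) for every overlapping "nn"."""
    pattern = re.compile(r"(n)(?=(n))")
    # group(1) is the consumed 'n'; group(2) is the 'n' seen by the lookahead.
    return [(m.start(), m.group(1) + m.group(2)) for m in pattern.finditer(text)]

print(overlapping_pairs("nnnn"))  # [(0, 'nn'), (1, 'nn'), (2, 'nn')]
```

Only group(1) is consumed, so the next search begins at the character the lookahead just inspected.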

This works because:

  • Any valid regular expression can be used inside the lookahead.
  • If it contains capturing parentheses, the backreferences will be saved.
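These two points suggest a common variation, sketched here in Python as an assumption rather than something the answer itself states: put the whole pattern inside the lookahead and wrap it in a capturing group. The sample text and the `\d{3}` pattern are made up for illustration.

```python
import re

text = "12345"

# The lookahead matches an empty string at each position where three digits
# follow; the capturing group inside it still records those three digits.
windows = re.findall(r"(?=(\d{3}))", text)

print(windows)  # ['123', '234', '345']
```

Since every match is zero-width, the engine retries at each successive position, so no overlapping candidate is skipped.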

So group(1) and group(2) will capture whatever 'n' represents (even if it is a complicated regex).
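To make this concrete, the sketch below swaps the literal n for a slightly richer sub-pattern. The sub-pattern `[a-z]\d` (a letter followed by a digit) and the sample string are hypothetical, chosen only for illustration, and Python is an assumed language.

```python
import re

# Stand-in for whatever 'n' represents: one lowercase letter plus one digit.
unit = r"[a-z]\d"
pattern = re.compile(rf"({unit})(?=({unit}))")

pairs = [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer("a1b2c3")]

print(pairs)  # [(0, 'a1', 'b2'), (2, 'b2', 'c3')]
```

The two results, "a1b2" and "b2c3", share the unit "b2", so they overlap in the same way the "nn" matches do.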
