Overlapping matches in Regex

Question

I can't seem to find an answer to this problem, and I'm wondering if one exists. Simplified example:

Consider the string "nnnn", in which I want to find all matches of "nn", including those that overlap with each other. So the regex would provide the following 3 matches (brackets mark the matched part):

  1. [nn]nn
  2. n[nn]n
  3. nn[nn]

I realize this is not exactly what regexes are meant for, but walking the string and parsing it manually seems like an awful lot of code, considering that in reality the matching would have to be done using a pattern, not a literal string.

Recommended answer

Update 2016:

To get the three overlapping matches nn, nn, nn, SDJMcHattie proposes in the comments the pattern (?=(nn)) (see regex101).

(?=(nn))
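As a minimal sketch of how this pattern behaves, here it is run through Python's re module (the answer itself does not name a regex engine, so Python is an assumption made for illustration). The lookahead consumes no characters, so the engine tries every start position, and the inner capture group holds the text of each overlapping match.

```python
import re

text = "nnnn"

# The outer (?=...) is zero-width, so matching resumes one character later
# each time; group(1) holds the "nn" seen at that position.
matches = [(m.start(), m.group(1)) for m in re.finditer(r"(?=(nn))", text)]

print(matches)  # [(0, 'nn'), (1, 'nn'), (2, 'nn')]
```

Note that the overall match (group 0) is empty here; the useful text is in group 1.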


Original answer (2008)

A possible solution could be to use a positive lookbehind:

(?<=n)n

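It would give you the end position of each match (brackets mark the "nn" whose last character is matched):

  1. [nn]nn
  2. n[nn]n
  3. nn[nn]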
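The lookbehind variant can be checked with a short sketch, again assuming Python's re module purely for illustration. Each match is a single "n" that is preceded by another "n", so its end offset is the end of one overlapping "nn"; the variable names are hypothetical.

```python
import re

text = "nnnn"

# Each match is the second character of an overlapping "nn".
ends = [m.end() for m in re.finditer(r"(?<=n)n", text)]

# Recover the two-character matches from their end offsets.
pairs = [text[end - 2:end] for end in ends]

print(ends)   # [2, 3, 4]
print(pairs)  # ['nn', 'nn', 'nn']
```

Recovering the full match by slicing works here only because the match length is fixed at two characters; with a variable-length pattern the capturing-lookahead form is easier to use.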


As mentioned by Timothy Khouri, a positive lookahead is more intuitive (see example).

Rather than his proposal (?=nn)n, I would prefer the simpler form:

(n)(?=(n))

That would match at the first position of each string you want and would capture the second n in group(2).
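A minimal sketch of this form, assuming Python's re module for illustration: group(1) is the consumed first "n", group(2) is the "n" captured inside the lookahead, and concatenating them rebuilds each overlapping match.

```python
import re

text = "nnnn"

# Only the first "n" is consumed; the second is captured by the lookahead,
# so the next search starts just one character further on.
results = [
    (m.start(), m.group(1) + m.group(2))
    for m in re.finditer(r"(n)(?=(n))", text)
]

print(results)  # [(0, 'nn'), (1, 'nn'), (2, 'nn')]
```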

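That is so because:

  • Any valid regular expression can be used inside the lookahead.
  • If it contains capturing parentheses, the backreferences will be saved.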

So group(1) and group(2) will capture whatever 'n' represents (even if it is a complicated regex).
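To illustrate the point about arbitrary patterns, here is a hedged sketch of a small helper that wraps any pattern in a capturing lookahead. The function name find_overlapping and the sample pattern are hypothetical and not part of the original answer, and Python's re module is assumed.

```python
import re


def find_overlapping(pattern: str, text: str) -> list[tuple[int, str]]:
    """Return (start, matched_text) for every start position where pattern matches.

    Hypothetical helper: the pattern is wrapped as (?=(pattern)), so the
    lookahead consumes nothing and group(1) holds the matched text. Any
    groups inside the pattern are renumbered starting from 2.
    """
    wrapped = re.compile(f"(?=({pattern}))")
    return [(m.start(), m.group(1)) for m in wrapped.finditer(text)]


print(find_overlapping(r"nn", "nnnn"))
# [(0, 'nn'), (1, 'nn'), (2, 'nn')]

print(find_overlapping(r"[a-z]\d[a-z]", "a1b2c3d"))
# [(0, 'a1b'), (2, 'b2c'), (4, 'c3d')]
```

This approach reports at most one match per start position (the one the engine finds first), so it does not enumerate different-length matches that begin at the same offset.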
