Does regex lookahead affect subsequent match?

Question

I was playing around with regular expression lookaheads and came across something I don't understand.

I would expect this regular expression:

(?=1)x

to match this string:

"x1"

But it doesn't. In Ruby, the code looks like this:

> "x1".match /(?=1)x/
=> nil

Here's what I would expect to happen:

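  1. We start with the regex engine's cursor on "x".
  2. The regex engine searches the string for "1" and gets a match. The cursor is still on "x".
  3. Since the cursor hasn't moved, the regex engine searches for "x" and finds it.
  4. Success! Profit!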

But I'm apparently mistaken, because it's not matching. Could someone please tell me where I've gone wrong?

Incidentally, I've noticed that it works if the pattern matched by the lookahead contains the characters I'm matching in the subsequent expression, i.e. (?=x)x matches x1 just fine. I suspect this is the key to the mystery, but I'm just not getting it. :)

Recommended answer

A lookahead does not move the regex index forward; it "stands its ground". It only requires the presence (or, for a negative lookahead, the absence) of some pattern immediately after the current position in the string.

When you use (?=1)x, you tell the regex engine:

  1. The next character must be 1.
  2. Right at this same position, match the character x.

In other words, you require the same character to be both 1 and x, which is never true. This regex will never match anything.
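
The point above can be checked directly in Ruby. The following is a minimal sketch; the variable names are illustrative and not part of the original answer. It compares the pattern from the question with two variants that do match.

```ruby
# A lookahead is zero-width: it inspects the text at the current
# position without consuming it.

# The lookahead demands "1" at the same position where "x" must match.
no_match  = "x1".match(/(?=1)x/)

# The lookahead and the following token agree on the same character.
same_char = "x1".match(/(?=x)x/)

# Match "x" first, then assert that "1" comes immediately after it.
after_x   = "x1".match(/x(?=1)/)

p no_match       # nil
p same_char[0]   # "x"
p after_x[0]     # "x" (the "1" is checked but not part of the match)
```

If the intent was "an x that is followed by 1", placing the lookahead after the x, as in the third pattern, is probably what was wanted.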

Here is another example from regular-expressions.com:

Let's apply q(?=u)i to quit. The lookahead is now positive and is followed by another token. Again, q matches q and u matches u. Again, the match from the lookahead must be discarded, so the engine steps back from i in the string to u. The lookahead was successful, so the engine continues with i. But i cannot match u. So this match attempt fails. All remaining attempts fail as well, because there are no more q's in the string.
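
The quit walkthrough can be reproduced in Ruby. This is a small sketch with illustrative variable names; the second and third patterns are added here for contrast and are not taken from the quoted passage.

```ruby
str = "quit"

# After q, the lookahead sees u and succeeds, but the engine is still
# positioned at u, so the token i is tried against u and fails.
fails      = str.match(/q(?=u)i/)

# Here the token after the lookahead is u, so it matches and consumes u.
consumes_u = str.match(/q(?=u)u/)

# With nothing after the lookahead, only q is part of the match.
only_q     = str.match(/q(?=u)/)

p fails           # nil
p consumes_u[0]   # "qu"
p only_q[0]       # "q"
```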

Another must-read resource is rexegg.com:

Lookahead and lookbehind don't mean look way ahead into the distance. They mean look at the text immediately to the left or to the right. If you want to inspect a piece of string further down, you will need to insert "binoculars" inside the lookahead to get you to the part of the string you want to inspect, for instance a .*, or, ideally, more specific tokens.
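
Applied to the string from the question, the "binoculars" idea might look like the sketch below. The pattern (?=.*1)x is an illustration added here, not something given in the original question or answer; it reads as "somewhere from this position onward there is a 1, and the character at this position is x".

```ruby
# .* inside the lookahead lets it scan ahead from the current position.
with_one    = "x1".match(/(?=.*1)x/)
farther_one = "x21".match(/(?=.*1)x/)
without_one = "x2".match(/(?=.*1)x/)

p with_one[0]      # "x"
p farther_one[0]   # "x"
p without_one      # nil
```

Only the x is consumed in each successful case; the lookahead still contributes nothing to the matched text.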

And also:

Do not expect the pattern A(?=5) to match the A in the string AB25. Many beginners assume that the lookahead says that "there is a 5 somewhere to the right", but that is not so. After the engine matches the A, the lookahead (?=5) asserts that at the current position in the string, what immediately follows is a 5. If you want to check if there is a 5 somewhere (anywhere) to the right, you can use (?=[^5]*5).
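
Both patterns from this passage can be tried against AB25 in Ruby. A brief sketch, with variable names chosen only for illustration:

```ruby
str = "AB25"

# (?=5) requires a 5 immediately after A, but the next character is B.
adjacent = str.match(/A(?=5)/)

# [^5]* lets the lookahead skip over "B2" before requiring the 5.
anywhere = str.match(/A(?=[^5]*5)/)

p adjacent              # nil
p anywhere[0]           # "A"
p anywhere.post_match   # "B25" (nothing after A was consumed)
```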
