Does regex lookahead affect subsequent match?
Question
I was playing around with regular expression lookaheads and came across something I don't understand.
I would expect this regular expression:
(?=1)x
to match this string:
"x1"
But it doesn't. In Ruby the code looks like:
> "x1".match /(?=1)x/
=> nil
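Here's what I would expect to happen:
- We start with the regex parser's cursor on "x".
- The regex engine searches the string for "1" and gets a match. The cursor is still on "x".
- Since the cursor hasn't moved, the regex engine searches for "x" and finds it.
- Success! Profit!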
But I'm apparently mistaken, because it's not matching. Could someone please tell me where I've gone wrong?
Incidentally, I've noticed that if the pattern matched by the lookahead contains the characters I'm matching in the subsequent expression, it works. That is, (?=x)x matches x1 just fine. I suspect this is the key to the mystery, but I'm just not getting it. :)
Recommended answer
A lookahead does not move the regex index forward; it "stands its ground". What it does is require the presence or absence of some pattern immediately after the current position in the string.
When you use (?=1)x, you tell the regex engine:
- The next character must be 1.
- Right at this position, match the character x.
It means you require the same character to be both x and 1, which is never true. This regex will never match anything.
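A minimal Ruby sketch of the point above, using the string from the question. The variable names are only illustrative, and the third pattern, x(?=1), is an assumption about what the question was probably after: consume x first, then assert that 1 follows.

```ruby
# A lookahead is zero-width: it inspects the text at the current
# position without consuming any of it.
s = "x1"

never   = s.match(/(?=1)x/)   # asks one character to be both "1" and "x"
same    = s.match(/(?=x)x/)   # lookahead and token agree on "x"
after_x = s.match(/x(?=1)/)   # consume "x", then assert that "1" follows

p never        # nil
p same[0]      # "x"
p after_x[0]   # "x" (the "1" is checked but not part of the match)
```

In the third case the 1 stays unconsumed, so a following token in a longer pattern would still have to match it.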
Here is another example from regular-expressions.info:
Let's apply q(?=u)i to quit. The lookahead is now positive and is followed by another token. Again, q matches q and u matches u. Again, the match from the lookahead must be discarded, so the engine steps back from i in the string to u. The lookahead was successful, so the engine continues with i. But i cannot match u. So this match attempt fails. All remaining attempts fail as well, because there are no more q's in the string.
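The quoted walk-through can be checked in Ruby. This is only a sketch: the variable names are hypothetical, and q(?=u)u is added for contrast and is not part of the quoted text.

```ruby
word = "quit"

failed = word.match(/q(?=u)i/)   # after "q" the position is at "u"; "i" cannot match "u"
fixed  = word.match(/q(?=u)u/)   # the lookahead checks "u", then the token consumes it
bare   = word.match(/q(?=u)/)    # the lookahead adds nothing to the matched text

p failed     # nil
p fixed[0]   # "qu"
p bare[0]    # "q"
```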
Another must-read resource is rexegg.com:
Lookahead and lookbehind don't mean look way ahead into the distance. They mean look at the text immediately to the left or to the right. If you want to inspect a piece of string further down, you will need to insert "binoculars" inside the lookahead to get you to the part of the string you want to inspect, for instance a .*, or, ideally, more specific tokens.
And also:
Do not expect the pattern A(?=5) to match the A in the string AB25. Many beginners assume that the lookahead says that "there is a 5 somewhere to the right", but that is not so. After the engine matches the A, the lookahead (?=5) asserts that at the current position in the string, what immediately follows is a 5. If you want to check if there is a 5 somewhere (anywhere) to the right, you can use (?=[^5]*5).
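The same AB25 example as a short Ruby sketch. The variable names are made up for illustration, and the .* variant is included only to show the broader "binoculars" form mentioned in the first rexegg quote.

```ruby
text = "AB25"

adjacent = text.match(/A(?=5)/)        # requires "5" right after "A", but "B" is there
anywhere = text.match(/A(?=[^5]*5)/)   # skips non-5 characters, then requires a "5"
broad    = text.match(/A(?=.*5)/)      # same check using the less specific .*

p adjacent      # nil
p anywhere[0]   # "A"
p broad[0]      # "A"
```

In both successful cases only A is consumed; everything inside the lookahead is checked and then released.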