How exactly does the possessive quantifier work?
Question
At the end of the page there is an attempted explanation of how greedy, reluctant and possessive quantifiers work: http://docs.oracle.com/javase/tutorial/essential/regex/quant.html
However, I tried an example myself and I don't seem to fully understand it.
I'll paste the results directly:
Enter your regex: .*+foo
Enter input string to search: xfooxxxxxxfoo
No match found.
Enter your regex: (.*)+foo
Enter input string to search: xfooxxxxxxfoo
I found the text "xfooxxxxxxfoo" starting at index 0 and ending at index 13.
Why does the first regex find no match while the second one does? What exactly is the difference between these two regexes?
Recommended answer
The + after another quantifier means "don't allow the regex engine to backtrack into whatever the previous token has matched". (See a tutorial on possessive quantifiers here.)
So when you apply .*foo to "xfooxxxxxxfoo", the .* first matches the entire string. Then, since foo can't be matched, the regex engine backtracks one character at a time until it can, achieving a match when .* has matched "xfooxxxxxx" and foo has matched "foo".
Now the additional + in .*+foo prevents that backtracking from happening: .*+ keeps everything it has consumed, nothing is left for foo, and the match fails.
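To see the difference outside the tutorial's test harness, here is a minimal sketch using java.util.regex. The class name QuantifierDemo and the helper describe are hypothetical names chosen for this illustration; the helper only imitates the harness's message format and reports the first match, which is an assumption about how the harness behaves.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Hypothetical helper: reports the first match in the style of the
    // tutorial harness, or "No match found." if find() fails.
    static String describe(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (m.find()) {
            return "I found the text \"" + m.group() + "\" starting at index "
                    + m.start() + " and ending at index " + m.end() + ".";
        }
        return "No match found.";
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        // Greedy: .* takes the whole string, then backtracks until foo fits.
        System.out.println(describe(".*foo", input));
        // Possessive: .*+ takes the whole string and never gives anything back.
        System.out.println(describe(".*+foo", input));
    }
}
```

Run against the string from the question, the greedy pattern should report a match spanning the whole input, while the possessive one should report no match.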
When you write (.*)+foo, the + takes on an entirely different meaning: because it now follows a group rather than a quantifier, it means "one or more of the preceding token". By the way, you've created nested quantifiers, which is not a good idea. If you apply that regex to a string like "xfoxxxxxxxxxfox", you'll run into catastrophic backtracking.
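As a rough check that the outer + adds nothing useful here, the sketch below compares (.*)+foo with plain .*foo on the two strings mentioned in this thread. The class name NestedQuantifierDemo and the helper firstMatch are hypothetical. The code only compares the reported match spans and does not measure how much backtracking each pattern performs.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NestedQuantifierDemo {
    // Hypothetical helper: returns "start-end" of the first match, or "none".
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.start() + "-" + m.end() : "none";
    }

    public static void main(String[] args) {
        String[] inputs = {"xfooxxxxxxfoo", "xfoxxxxxxxxxfox"};
        for (String input : inputs) {
            // Same outcome either way; the nested form just gives the engine
            // many more ways to split the input before it can give up.
            System.out.println(input
                    + ": nested=" + firstMatch("(.*)+foo", input)
                    + ", plain=" + firstMatch(".*foo", input));
        }
    }
}
```

If the two columns agree on your own inputs as well, dropping the group and the outer + is the simpler choice.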