Pattern matcher using Greedy and Reluctant quantifiers
Question
For Java regex, I have read about greedy and reluctant quantifiers. The explanation I read says:
"A reluctant or "non-greedy" quantifier first matches as little as possible. So the .* matches nothing at first, leaving the entire string unmatched."
In this example:
Source: yyxxxyxx
Pattern: .*xx
With the greedy quantifier *, the result is:
0 yyxxxyxx
With the reluctant quantifier *? (pattern .*?xx), we get the following:
0 yyxx
4 xyxx
Why are yxx and yxx not possible results, even though they are the smallest possible matches?
Recommended answer
The regex engine returns the first, leftmost match it finds.
Basically, it tries to match the pattern starting at the first character. If it doesn't find a match there, the engine bumps along and tries again starting at the second character, and so on.
If you use a+?b on bab, it will first try from the first b. That doesn't work, so we try from the second character.
But in your example it finds a match (yyxx) starting right at the first character. Starting from the second character isn't even considered: we found a match, so we return it.
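To make this concrete, here is a minimal sketch that runs the question's source string through java.util.regex. The class name GreedyVsReluctant and the helpers findAll and findFrom are hypothetical names introduced for illustration. The last call uses Matcher.find(int) to force the search to begin at index 1, which is the only way the engine would ever report yxx.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsReluctant {
    // Collects "start match" for every successive find() on the input.
    static List<String> findAll(String regex, String input) {
        List<String> results = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            results.add(m.start() + " " + m.group());
        }
        return results;
    }

    // Resets the matcher and starts searching at index 'from' instead of 0.
    static String findFrom(String regex, String input, int from) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find(from) ? m.start() + " " + m.group() : null;
    }

    public static void main(String[] args) {
        String source = "yyxxxyxx";
        System.out.println(findAll(".*xx", source));      // [0 yyxxxyxx]
        System.out.println(findAll(".*?xx", source));     // [0 yyxx, 4 xyxx]
        System.out.println(findFrom(".*?xx", source, 1)); // 1 yxx
    }
}
```

In a normal find() loop the attempt at index 0 succeeds, so the attempt at index 1 never happens. The second search then resumes at index 4, after the first match, which is why xyxx is reported and not the shorter yxx.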
If you apply a+?b on aab, we try at the first a and find an overall match: end of story, no reason to try anything else.
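The two a+?b cases above can be checked with a short sketch. LazyLeftmost and firstMatch are hypothetical names used only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LazyLeftmost {
    // Returns "start match" for the first match, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.start() + " " + m.group() : null;
    }

    public static void main(String[] args) {
        // The attempt at index 0 ('b') fails, so the engine retries at index 1.
        System.out.println(firstMatch("a+?b", "bab")); // 1 ab
        // The attempt at index 0 succeeds once a+? expands to "aa",
        // so the shorter "ab" at index 1 is never tried.
        System.out.println(firstMatch("a+?b", "aab")); // 0 aab
    }
}
```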
To sum up: the regex engine scans from left to right, so laziness can only affect where a match ends (its right side), not where it starts.
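As a small illustration of that summary, the sketch below compares a greedy and a reluctant pattern on the same input. The input "xabbb", the class LazyRightEdge, and the helper span are assumptions chosen for the example, not part of the original question. Both patterns start matching at the same index, and only the end index differs.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LazyRightEdge {
    // Returns "start..end" (end exclusive) of the first match, or null.
    static String span(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.start() + ".." + m.end() : null;
    }

    public static void main(String[] args) {
        String input = "xabbb";
        System.out.println(span("ab+", input));  // 1..5 (greedy: "abbb")
        System.out.println(span("ab+?", input)); // 1..3 (reluctant: "ab")
    }
}
```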