正则表达式匹配超过预期 [英] Regular expression matches more than expected
问题描述
给定的是以下python脚本:
Given is the following python script:
text = '<?xml version="1.24" encoding="utf-8">'
mu = (".??[?]?[?]", "....")
for item in mu:
print item,":",re.search(item, text).group()
有人可以解释一下为什么第一次使用正则表达式 .??[?]?[?]
返回 <?
而不是 ?代码>.
Can someone please explain why the first hit with the regex .??[?]?[?]
returns <?
instead of just ?
.
我的解释:
.??
应该不匹配任何内容,因为.?
可以匹配或不匹配任何字符,第二个?
使它不贪婪.立>[?]?
能不能匹配?
,所以也没什么好[?]
只匹配?
.??
should match nothing as.?
can match or not any char and the second?
makes it not greedy.[?]?
can match?
or not, so nothing is good, too[?]
just matches?
这应该导致 ?
而不是 <?
That should result in ?
and not in <?
推荐答案
出于同样的原因 o*?bar
匹配 foobar
中的 oobar
.即使量词是非贪婪的,正则表达式也会尝试以所有可能的方式从第一个字符开始匹配,然后再继续下一个.
For the same reason o*?bar
matches oobar
in foobar
. Even if the quantifier is non-greedy the regex will try to match from the first char in all possible ways, before moving on to the next.
首先 .??
匹配一个空字符串,但是当正则表达式引擎回溯到它时,它匹配 <
,从而使正则表达式的其余部分匹配,不将匹配的开始位置移动到下一个字符.
First the .??
matches an empty string, but when the regex engine backtracks to it, it matches <
, thus making the rest of the regex match, without moving the start position of the match to the next character.
这篇关于正则表达式匹配超过预期的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!