首页
Python
正则表达式匹配超过预期

正则表达式匹配超过预期 [英] Regular expression matches more than expected

查看：29 发布时间：2021/6/14 20:15:52 python regex pattern-matching

本文介绍了正则表达式匹配超过预期的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

给定的是以下python脚本:

Given is the following python script:

text = '<?xml version="1.24" encoding="utf-8">'
mu = (".??[?]?[?]", "....")
for item in mu:
    print item,":",re.search(item, text).group()

有人可以解释一下为什么第一次使用正则表达式 .??[?]?[?] 返回 <? 而不是 ?.


Can someone please explain why the first hit with the regex .??[?]?[?] returns <? instead of just ?.
我的解释:

.?? 应该不匹配任何内容，因为 .? 可以匹配或不匹配任何字符，第二个 ? 使它不贪婪.立>
[?]? 能不能匹配?，所以也没什么好
[?] 只匹配 ?

.?? should match nothing as .? can match or not any char and the second ? makes it not greedy.
[?]? can match ? or not, so nothing is good, too
[?] just matches ?

这应该导致 ? 而不是 <?

That should result in ? and not in <?

推荐答案

出于同样的原因 o*?bar 匹配 foobar 中的 oobar.即使量词是非贪婪的，正则表达式也会尝试以所有可能的方式从第一个字符开始匹配，然后再继续下一个.

For the same reason o*?bar matches oobar in foobar. Even if the quantifier is non-greedy the regex will try to match from the first char in all possible ways, before moving on to the next.

首先 .?? 匹配一个空字符串，但是当正则表达式引擎回溯到它时，它匹配 <，从而使正则表达式的其余部分匹配，不将匹配的开始位置移动到下一个字符.

First the .?? matches an empty string, but when the regex engine backtracks to it, it matches <, thus making the rest of the regex match, without moving the start position of the match to the next character.

这篇关于正则表达式匹配超过预期的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

登录关闭

扫码关注1秒登录

发送“验证码”获取 | 15天全站免登陆