re module non-greedy matches broken

Problem description

re:
4.2.1 Regular Expression Syntax
http://docs.python.org/lib/re-syntax.html

*?, +?, ??
Adding "?" after the qualifier makes it perform the match in non-greedy or
minimal fashion; as few characters as possible will be matched.

The regular expression module fails to perform non-greedy matches as
described in the documentation: more than "as few characters as possible"
are matched.

This is a bug and it needs to be fixed.

Examples follow.

lothar@erda /ntd/vl
$ cat vwre.py
#! /usr/bin/env python

import re

vwre = re.compile("V.*?W")
vwlre = re.compile("V.*?WL")

if __name__ == "__main__":

    newdoc = "V1WVVV2WWW"
    vwli = re.findall(vwre, newdoc)
    print "vwli[], expect", ['V1W', 'V2W']
    print "vwli[], return", vwli

    newdoc = "V1WLV2WV3WV4WLV5WV6WL"
    vwlli = re.findall(vwlre, newdoc)
    print "vwlli[], expect", ['V1WL', 'V4WL', 'V6WL']
    print "vwlli[], return", vwlli

lothar@erda /ntd/vl
$ python vwre.py
vwli[], expect ['V1W', 'V2W']
vwli[], return ['V1W', 'VVV2W']
vwlli[], expect ['V1WL', 'V4WL', 'V6WL']
vwlli[], return ['V1WL', 'V2WV3WV4WL', 'V5WV6WL']

lothar@erda /ntd/vl
$ python -V
Python 2.3.3
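
A note on the output above: it looks consistent with how the matcher scans rather than with a defect. The engine takes the leftmost position where a match can start, and "*?" only keeps the match as short as possible from that starting point. It does not go looking for a later start that would give a shorter result. The sketch below reproduces the reported results and then tries a negated character class, which returns the lists the post expected. The "strict" patterns and all variable names are assumptions for illustration and do not appear in the original post. The sketch is written for Python 3, so it uses print() where the post's Python 2.3.3 script uses the print statement.

```python
import re

# Patterns from the post: lazy quantifiers.
lazy_vw = re.compile(r"V.*?W")
lazy_vwl = re.compile(r"V.*?WL")

# Hypothetical alternatives (not from the post): a negated character class
# stops a match from running across another "V" or "W".
strict_vw = re.compile(r"V[^VW]*W")
strict_vwl = re.compile(r"V[^VW]*WL")

doc1 = "V1WVVV2WWW"
doc2 = "V1WLV2WV3WV4WLV5WV6WL"

# After "V1W" the scan resumes at index 3, the first "V" of "VVV2WWW".
# A match can start there, so the lazy pattern reports "VVV2W".
print(lazy_vw.findall(doc1))     # ['V1W', 'VVV2W']
print(strict_vw.findall(doc1))   # ['V1W', 'V2W']

# The lazy pattern starts at the "V" of "V2W" and extends to the first "WL".
print(lazy_vwl.findall(doc2))    # ['V1WL', 'V2WV3WV4WL', 'V5WV6WL']
print(strict_vwl.findall(doc2))  # ['V1WL', 'V4WL', 'V6WL']
```

The negated class fits only when the text between the delimiters can never contain "V" or "W". If it can, a different pattern would be needed.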
