re module non-greedy matches broken
Problem description
re:
4.2.1 Regular Expression Syntax
http://docs.python.org/lib/re-syntax.html
*?, +?, ??
Adding "?" after the qualifier makes it perform the match in non-greedy or
minimal fashion; as few characters as possible will be matched.
The regular expression module fails to perform non-greedy matches as
described in the documentation: more than "as few characters as possible"
are matched.
This is a bug and it needs to be fixed.
Examples follow.
lothar@erda /ntd/vl
$ cat vwre.py
#! /usr/bin/env python
import re
vwre = re.compile("V.*?W")
vwlre = re.compile("V.*?WL")
if __name__ == "__main__":
    newdoc = "V1WVVV2WWW"
    vwli = re.findall(vwre, newdoc)
    print "vwli[], expect", ['V1W', 'V2W']
    print "vwli[], return", vwli
    newdoc = "V1WLV2WV3WV4WLV5WV6WL"
    vwlli = re.findall(vwlre, newdoc)
    print "vwlli[], expect", ['V1WL', 'V4WL', 'V6WL']
    print "vwlli[], return", vwlli
lothar@erda /ntd/vl
$ python vwre.py
vwli[], expect ['V1W', 'V2W']
vwli[], return ['V1W', 'VVV2W']
vwlli[], expect ['V1WL', 'V4WL', 'V6WL']
vwlli[], return ['V1WL', 'V2WV3WV4WL', 'V5WV6WL']
lothar@erda /ntd/vl
$ python -V
Python 2.3.3
Recommended answer
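The returned lists look consistent with how a backtracking regex engine searches: it first fixes the leftmost position where a match can start, and only then does "?" make the quantifier consume as few characters as possible from that start. The sketch below is an illustration of that reading, not part of the original post. It uses Python 3 syntax, and the names text, lazy, greedy, lazy_matches and greedy_matches are made up for the example.

```python
import re

text = "V1WVVV2WWW"
lazy = re.compile(r"V.*?W")    # non-greedy, as in the post
greedy = re.compile(r"V.*W")   # greedy, for comparison

# Each match starts at the first "V" after the previous match ended;
# "?" only shortens the match from that fixed starting point.
lazy_matches = [(m.start(), m.end(), m.group()) for m in lazy.finditer(text)]
for start, end, matched in lazy_matches:
    print(start, end, matched)

# The greedy pattern runs from the first "V" to the last "W".
greedy_matches = greedy.findall(text)
print(greedy_matches)
```

The second non-greedy match starts at index 3, the first "V" after "V1W", so it has to include "VVV2W". Read this way, "as few characters as possible" describes where a match ends, not where it starts.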
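If the goal is the "expect" lists from the question, one possible approach is to keep the match from running across another "V" by using a negated character class instead of ".". This is only a sketch in Python 3 syntax. It assumes the intent is "a V, then no further V, up to the terminator", which is a guess based on the expected output rather than something the post states.

```python
import re

# Assumption: a match must not contain a second "V", so "." is
# replaced by "[^V]" in both patterns from the question.
vwre = re.compile(r"V[^V]*?W")
vwlre = re.compile(r"V[^V]*?WL")

vwli = vwre.findall("V1WVVV2WWW")
vwlli = vwlre.findall("V1WLV2WV3WV4WLV5WV6WL")

print("vwli[], return", vwli)
print("vwlli[], return", vwlli)
```

With these patterns a candidate that starts at one "V" fails as soon as it would have to cross another "V", so the engine moves on to the next possible start.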