Regex anomaly


Problem description




Hello,

Has anyone had issues with compiled re's vis-a-vis the re.I (ignore
case) flag? I can't make sense of this compiled re producing a
different match when given the flag, odd both in its difference from
the uncompiled regex (as I thought the uncompiled API was a wrapper
around a compile-and-execute block) and its difference from the
compiled version with no flag specified. The match given is utter
nonsense given the input re.

In [48]: import re
In [49]: reStr = r"([a-z]+)://"
In [51]: against = "http://www.hello.com"
In [53]: re.match(reStr, against).groups()
Out[53]: ('http',)
In [54]: re.match(reStr, against, re.I).groups()
Out[54]: ('http',)
In [55]: reCompiled = re.compile(reStr)
In [56]: reCompiled.match(against).groups()
Out[56]: ('http',)
In [57]: reCompiled.match(against, re.I).groups()
Out[57]: ('tp',)

cheers,
-Mike

Solution


LOL, and you'll be LOL too when you see the problem :-)

You can't give the re.I flag to reCompiled.match(). You have to give
it to re.compile(). The second argument to reCompiled.match() is the
position at which to start searching. I'm guessing re.I is defined as 2,
which explains the match you got.
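
The guess about the flag's value can be checked directly. A minimal sketch, assuming a current Python 3 interpreter (the thread itself used Python 2.3 and 2.4); the names `pattern` and `text` are illustrative and not from the original post:

```python
import re

pattern = re.compile(r"([a-z]+)://")
text = "http://www.hello.com"

# re.I is an integer-valued flag; its numeric value is 2.
flag_value = int(re.I)

# As the second positional argument, re.I is read as pos=2,
# so matching starts at "tp://www.hello.com".
with_flag_as_pos = pattern.match(text, re.I).groups()
with_explicit_pos = pattern.match(text, 2).groups()

print(flag_value, with_flag_as_pos, with_explicit_pos)
```

Both calls give the same result because the compiled object's match() only ever sees the integer 2.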

This is actually one of those places where duck typing lets us down.
If we had type bondage, re.I would be an instance of RegExFlags, and
reCompiled.match() would have thrown a TypeError when the second
argument wasn't an integer. I'm not saying type bondage is inherently
better than duck typing, just that it has its benefits at times.
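
For what it's worth, in current Python 3 the flags are members of `re.RegexFlag`, but that class is still an integer subclass, so the compiled match() accepts it as a position without complaint. A rough sketch of the kind of check described above, using a hypothetical helper `strict_match` that is not part of the `re` module:

```python
import re


def strict_match(compiled, string, pos=0):
    """Hypothetical helper: reject a regex flag passed where a position is expected."""
    if isinstance(pos, re.RegexFlag):
        raise TypeError("pos must be a plain integer, not a regex flag")
    return compiled.match(string, pos)


compiled = re.compile(r"([a-z]+)://")

try:
    strict_match(compiled, "http://www.hello.com", re.I)
    rejected = False
except TypeError:
    rejected = True

ok = strict_match(compiled, "http://www.hello.com").groups()
print(rejected, ok)
```

This only illustrates the idea; whether such a guard is worth the extra call is a matter of taste.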


On 2 Jan 2006 21:00:53 -0800, mi********@gmail.com <mi********@gmail.com> wrote:


> Has anyone had issues with compiled re's vis-a-vis the re.I (ignore
> case) flag? I can't make sense of this compiled re producing a
> different match when given the flag, odd both in its difference from
> the uncompiled regex (as I thought the uncompiled API was a wrapper
> around a compile-and-execute block) and its difference from the
> compiled version with no flag specified. The match given is utter
> nonsense given the input re.



The re.compile() and re.match() functions take a flags parameter:

compile(pattern[, flags])
match(pattern, string[, flags])

But the regular expression object's method takes different parameters:

match(string[, pos[, endpos]])

It's not a little confusing that the parameters to re.match() and
re.compile().match() are so different, but that's the cause of what
you're seeing.

You need to do:

reCompiled = re.compile(reStr, re.I)
reCompiled.match(against).groups()

to get the behaviour you want.

Andrew
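
To see the flag actually change the result, the input needs upper-case letters. A small sketch under that assumption: the upper-case URL is made up for illustration, the names `re_str` and `against` mirror the original session, and the inline `(?i)` form is an alternative not mentioned in the thread:

```python
import re

re_str = r"([a-z]+)://"
against = "HTTP://www.hello.com"  # upper-case scheme, chosen so the flag matters

# Without the flag, [a-z] does not match upper-case letters.
no_flag = re.compile(re_str).match(against)

# Flag supplied at compile time, as suggested above.
compiled = re.compile(re_str, re.I)
with_flag = compiled.match(against).groups()

# Alternative: an inline (?i) flag at the start of the pattern.
inline = re.compile(r"(?i)([a-z]+)://").match(against).groups()

print(no_flag, with_flag, inline)
```

With the all-lower-case URL from the question, the flagged and unflagged patterns match identically, which is why the only visible difference was the shifted start position.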


>>>>> mike klaas <mi********@gmail.com> writes:

> In [48]: import re
> In [49]: reStr = r"([a-z]+)://"
> In [51]: against = "http://www.hello.com"
> In [53]: re.match(reStr, against).groups()
> Out[53]: ('http',)
> In [54]: re.match(reStr, against, re.I).groups()
> Out[54]: ('http',)
> In [55]: reCompiled = re.compile(reStr)
> In [56]: reCompiled.match(against).groups()
> Out[56]: ('http',)
> In [57]: reCompiled.match(against, re.I).groups()
> Out[57]: ('tp',)



I can reproduce this on Debian Linux testing, with both Python 2.3 and Python
2.4. Seems like a bug. search() also exhibits the same behavior.
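
The search() observation can be reproduced as follows. This is a sketch assuming Python 3; the variable names are illustrative. The start offsets suggest the result comes from the flag being read as a start position, as the other replies in this thread explain:

```python
import re

compiled = re.compile(r"([a-z]+)://")
against = "http://www.hello.com"

# search() on a compiled pattern has the same (string[, pos[, endpos]]) shape as match().
plain = compiled.search(against)
shifted = compiled.search(against, re.I)  # re.I is read as pos=2

print(plain.groups(), plain.start())
print(shifted.groups(), shifted.start())
```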

Ganesan
--
Ganesan Rajagopal (rganesan at debian.org) | GPG Key: 1024D/5D8C12EA
Web: http://employees.org/~rganesan | http://rganesan.blogspot.com

