Python re.sub back reference not back referencing


Problem description

I have the following:

<text top="52" left="20" width="383" height="15" font="0"><b>test</b></text>

and I have the following:

fileText = re.sub("<b>(.*?)</b>", "\1", fileText, flags=re.DOTALL)

Here fileText is the string I posted above. When I print out fileText after running the regex replacement, I get back

<text top="52" left="20" width="383" height="15" font="0"></text>

instead of the expected

<text top="52" left="20" width="383" height="15" font="0">test</text>

Now, I am fairly proficient at regex and I know that it should work. In fact, I know that it matches properly, because I can see the match in the groups when I do a search and print them out. But I am new to Python and am confused as to why it's not handling back references properly.

Solution

You need to use a raw string here so that Python does not process the backslash as an escape character before re.sub ever sees it:

>>> import re
>>> fileText = '<text top="52" left="20" width="383" height="15" font="0"><b>test</b></text>'
>>> fileText = re.sub("<b>(.*?)</b>", r"\1", fileText, flags=re.DOTALL)
>>> fileText
'<text top="52" left="20" width="383" height="15" font="0">test</text>'
>>>

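Notice how "\1" was changed to r"\1". Though it is a very small change (one character), it has a big effect: without the r prefix, Python reads "\1" as an octal escape for the single character with code 1, so the replacement string never contains a backslash at all. See below: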

>>> "\1"
'\x01'
>>> r"\1"
'\\1'
>>>
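
As a rough sketch of the same point outside the interactive prompt, the script below first reproduces the problem and then shows a few spellings that should all yield the expected result. The variable names (file_text, pattern, buggy and so on) are made up for illustration and are not from the original question. Printing with repr() makes the otherwise invisible control character show up.

```python
import re

file_text = '<text top="52" left="20" width="383" height="15" font="0"><b>test</b></text>'
pattern = r"<b>(.*?)</b>"
expected = '<text top="52" left="20" width="383" height="15" font="0">test</text>'

# Non-raw "\1" is the single control character with code 1, so the match
# is replaced by that character rather than by the contents of group 1.
buggy = re.sub(pattern, "\1", file_text, flags=re.DOTALL)
print(repr(buggy))  # repr() reveals the \x01 that a plain print hides

# Three equivalent ways to hand re.sub a real back reference.
raw = re.sub(pattern, r"\1", file_text, flags=re.DOTALL)
escaped = re.sub(pattern, "\\1", file_text, flags=re.DOTALL)
explicit = re.sub(pattern, r"\g<1>", file_text, flags=re.DOTALL)

# A callable replacement avoids backslash escaping altogether.
via_callable = re.sub(pattern, lambda m: m.group(1), file_text, flags=re.DOTALL)

print(raw == escaped == explicit == via_callable == expected)
```

The \g<1> form is handy when the back reference is immediately followed by a digit, since r"\10" would be read as group 10 rather than group 1 followed by "0".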
