Why don't backreferences work in Python's re.sub when using a replacement function?


Question

Using re.sub in Python 2.7, the following example uses a simple backreference in a replacement string:

re.sub('-{1,2}', r'\g<0> ', 'pro----gram-files')

As expected, it outputs the following string:

'pro-- -- gram- files'

I would expect the following example, which returns the same template from a replacement function, to behave identically, but it does not:

def dashrepl(matchobj):
    return r'\g<0> '
re.sub('-{1,2}', dashrepl, 'pro----gram-files')

Instead, it gives the following unexpected output:

'pro\\g<0> \\g<0> gram\\g<0> files'

Why do the two examples give different output? Did I miss something in the documentation that explains this? Is there a particular reason this behavior is preferable to what I expected? Is there a way to use backreferences in a replacement function?

Recommended answer

There are simpler ways to achieve your goal inside a replacement function, so you can use those instead of a backreference.

As you have already seen, your replacement function receives a match object as its argument.

This object has, among others, a group() method, which you can use in place of the backreference:

def dashrepl(matchobj):
    return matchobj.group(0) + ' '

This gives exactly the result you expected.
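To make the comparison concrete, here is a minimal sketch (written for Python 3, and not part of the original answer) that runs the string template next to two replacement functions. The names `PATTERN`, `TEXT`, `dashrepl_group` and `dashrepl_expand` are illustrative. The second function relies on the match object's `expand()` method, which applies the same template substitution that re.sub performs on a replacement string, so it is one way to keep using backreference syntax inside a function.

```python
import re

PATTERN = '-{1,2}'
TEXT = 'pro----gram-files'

def dashrepl_group(matchobj):
    # Build the replacement by hand from the matched text.
    return matchobj.group(0) + ' '

def dashrepl_expand(matchobj):
    # Let the match object expand the backreference template itself.
    return matchobj.expand(r'\g<0> ')

via_string = re.sub(PATTERN, r'\g<0> ', TEXT)
via_group = re.sub(PATTERN, dashrepl_group, TEXT)
via_expand = re.sub(PATTERN, dashrepl_expand, TEXT)

print(via_string)  # pro-- -- gram- files
print(via_group == via_string)
print(via_expand == via_string)
```

Using group() is usually the plainer choice; expand() is mainly useful when the template is supplied as data rather than written into the function.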

But you are completely right that the docs are a bit confusing on this point. They describe the repl argument as follows:

repl can be a string or a function; if it is a string, any backslash escapes in it are processed.

If repl is a function, it is called for every non-overlapping occurrence of pattern. The function takes a single match object argument, and returns the replacement string.

You could read this as saying that "the replacement string" returned by the function is also subject to backslash-escape processing.

But that processing is described only for the case where "it is a string", so the string returned by a function is inserted as-is. The docs are consistent on this, though it is not obvious at first glance.
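The difference is easy to check with an ordinary escape rather than a backreference. The following is a small sketch, again assuming Python 3 and using the hypothetical helper name `literal_repl`: the same two-character template `\n` is passed once as a string and once as the return value of a function.

```python
import re

def literal_repl(matchobj):
    # The returned string is inserted verbatim; escapes are not processed.
    return r'\n'

from_string = re.sub('x', r'\n', 'axb')           # escape is processed
from_function = re.sub('x', literal_repl, 'axb')  # escape is left alone

print(repr(from_string))    # 'a\nb'  (a real newline, 3 characters)
print(repr(from_function))  # 'a\\nb' (backslash followed by n, 4 characters)
```

The same rule explains the output in the question: the `\g<0>` returned by the function is treated as plain text, not as a reference to the match.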
