Can't use '\1' backreference to capture-group in a function call in re.sub() repr expression

Question

I have a string S = '02143' and a list A = ['a','b','c','d','e']. I want to replace all those digits in 'S' with their corresponding element in list A.

For example, replace 0 with A[0], 2 with A[2] and so on. Final output should be S = 'acbed'.

I tried:

S = re.sub(r'([0-9])', A[int(r'\g<1>')], S)

However, this raises ValueError: invalid literal for int() with base 10: '\\g<1>'. I guess it is treating the backreference '\g<1>' as a plain string. How can I solve this, preferably using re.sub and capture groups, or otherwise by some other method?

Recommended answer

The reason re.sub(r'([0-9])', A[int(r'\g<1>')], S) does not work is that the \g<1> backreference (an unambiguous way of writing the first backreference, otherwise written as \1) only works when it is used in a string replacement pattern. If you pass it to another function, that function sees just the literal string \g<1>, since the re module has no chance to evaluate it at that point. The re engine only expands it while processing a match, but the A[int(r'\g<1>')] expression is evaluated before re.sub is even called, and therefore before the engine attempts to find a match.
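The evaluation order can be seen in a small sketch. It reuses S and A from the question; the names error_message and wrapped, and the angle brackets in the template, are only illustrative.

```python
import re

S = '02143'
A = ['a', 'b', 'c', 'd', 'e']

# Python evaluates the arguments before re.sub is called, so int()
# receives the raw template text rather than a matched digit.
try:
    replacement = A[int(r'\g<1>')]
except ValueError as exc:
    error_message = str(exc)

# Passed directly as the replacement string, the same template is
# expanded by re.sub for every match.
wrapped = re.sub(r'([0-9])', r'<\g<1>>', S)

print(error_message)
print(wrapped)  # <0><2><1><4><3>
```

The first part fails before any matching happens. The second part works because re.sub itself receives the template and expands it for each match.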

That is why re.sub accepts a callback function as the replacement argument: you can pass the matched group values to any external function for more advanced manipulation.

See the re documentation:

re.sub(pattern, repl, string, count=0, flags=0)

If repl is a function, it is called for every non-overlapping occurrence of pattern. The function takes a single match object argument, and returns the replacement string.
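As a rough illustration of that contract, the callback can also be written as a named function. The name digit_to_letter is hypothetical; S and A are taken from the question.

```python
import re

S = '02143'
A = ['a', 'b', 'c', 'd', 'e']

def digit_to_letter(match):
    # re.sub passes a match object; group(1) is the captured digit.
    return A[int(match.group(1))]

# The function object itself is passed, not the result of calling it.
result = re.sub(r'([0-9])', digit_to_letter, S)
print(result)  # acbed
```

Note that this sketch assumes every digit in S is a valid index into A. A digit of 5 or higher would raise IndexError with this five-element list.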

Use

import re

S = '02143'
A = ['a', 'b', 'c', 'd', 'e']
# The lambda receives each match object; x.group() is the matched digit.
print(re.sub(r'[0-9]', lambda x: A[int(x.group())], S))  # prints acbed

See the Python demo

Note that you do not need to capture the whole pattern with parentheses: you can access the whole match with x.group().
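A short comparison, again assuming S and A from the question, suggests that both spellings give the same result here. The variable names are illustrative only.

```python
import re

S = '02143'
A = ['a', 'b', 'c', 'd', 'e']

m = re.search(r'([0-9])', S)
whole = m.group()   # the whole match, same as m.group(0)
first = m.group(1)  # the text captured by the first group

# When the pattern is a single group, both forms see the same text.
without_group = re.sub(r'[0-9]', lambda x: A[int(x.group())], S)
with_group = re.sub(r'([0-9])', lambda x: A[int(x.group(1))], S)

print(whole, first, without_group, with_group)
```

The capture group only becomes necessary when the pattern matches more text than the part you want to look up.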
