Why does re.sub replace the entire pattern, not just a capturing group within it?
Question
re.sub('a(b)','d','abc')
yields dc, instead of adc.
Why does re.sub replace the entire match, instead of just the capturing group '(b)'?
Recommended answer
Because it is supposed to replace the whole occurrence of the pattern:
"Return the string obtained by replacing the leftmost non-overlapping occurrences of the pattern in string by the replacement repl."
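To make "occurrence of the pattern" concrete, here is a minimal sketch that inspects the match for the example from the question. The variable names are illustrative only. Group 0 is the whole match, which is the part re.sub replaces; group 1 is just the parenthesized part.

```python
import re

# Look at the first match of the question's pattern in the question's string.
m = re.search('a(b)', 'abc')

whole_match = m.group(0)   # text matched by the entire pattern
inner_group = m.group(1)   # text matched by the capturing group only

print(whole_match, m.span(0))   # ab (0, 2)
print(inner_group, m.span(1))   # b (1, 2)

# re.sub swaps out group 0, so both 'a' and 'b' disappear.
result = re.sub('a(b)', 'd', 'abc')
print(result)                   # dc
```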
If it replaced only some subgroup, complex regexes with several groups wouldn't work, since it would be unclear which group the replacement applies to. There are several possible solutions:
- Specify the pattern in full:
re.sub('ab', 'ad', 'abc')
This is my favorite, as it's very readable and explicit.
- Capture the groups you want to preserve and refer to them in the replacement string (note that it should be a raw string so the backslash does not need escaping):
re.sub('(a)b', r'\1d', 'abc')
- Similar to the previous option: provide a callback function as the repl argument, and have it process the Match object and return the required result.
- Use lookbehinds/lookaheads, which are not included in the match but affect matching:
re.sub('(?<=a)b', r'd', 'abxb')
yields adxb. The ?<= at the beginning of the group says "it's a lookbehind".
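The callback option is the only one above without an example, so here is a minimal sketch of it next to the other three. The function names keep_a and replace_group are hypothetical and not part of the re module; replace_group is one possible way to emulate "replace only this group" with match offsets, not a standard API.

```python
import re

# The three one-line options from the list.
full_pattern = re.sub('ab', 'ad', 'abc')
backreference = re.sub('(a)b', r'\1d', 'abc')
lookbehind = re.sub('(?<=a)b', 'd', 'abxb')

# Callback option: re.sub calls the function once per match and
# inserts whatever string it returns.
def keep_a(match):
    # match.group(1) is the captured 'a'; append the new text after it.
    return match.group(1) + 'd'

via_callback = re.sub('(a)b', keep_a, 'abc')

def replace_group(pattern, group, new_text, string):
    """Hypothetical helper: replace only `group` inside each match."""
    def repl(match):
        whole = match.group(0)
        if match.start(group) == -1:
            # The group did not participate in this match; leave it alone.
            return whole
        # Offsets of the group relative to the start of the whole match.
        start = match.start(group) - match.start(0)
        end = match.end(group) - match.start(0)
        return whole[:start] + new_text + whole[end:]
    return re.sub(pattern, repl, string)

# The original pattern from the question, now giving the expected result.
via_helper = replace_group('a(b)', 1, 'd', 'abc')

print(full_pattern, backreference, lookbehind, via_callback, via_helper)
```

With the helper, the pattern from the question can stay as 'a(b)', at the cost of more code than simply spelling out the replacement in full.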