Why does re.sub replace the entire pattern, not just a capturing group within it?


Problem description

re.sub('a(b)', 'd', 'abc') yields dc, not adc.

Why does re.sub replace the entire match, instead of just the capturing group '(b)'?

Recommended answer

Because it's supposed to replace the whole occurrence of the pattern:

Return the string obtained by replacing the leftmost non-overlapping occurrences of the pattern in string by the replacement repl.
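
To make the quoted behaviour concrete, here is a minimal sketch. The variable names are my own for illustration; only `re.compile`, `search`, `group` and `sub` come from the standard `re` module. It shows that the "occurrence of the pattern" is group 0 (the whole match), while the parenthesised part is only group 1.

```python
import re

pattern = re.compile(r"a(b)")
m = pattern.search("abc")

whole_match = m.group(0)  # text matched by the entire pattern
inner_group = m.group(1)  # text matched by the capturing group only

# sub() replaces the span of the whole match (group 0), not group 1.
result = pattern.sub("d", "abc")

# Every non-overlapping occurrence is replaced, scanning left to right.
result_multi = pattern.sub("d", "abcab")

print(whole_match, inner_group, result, result_multi)
```

The capturing group only records what matched inside the parentheses; it does not narrow what gets replaced.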

If it were to replace only some subgroup, then complex regexes with several groups wouldn't work. There are several possible solutions:

  1. Specify the pattern in full: re.sub('ab', 'ad', 'abc') - my favorite, as it's very readable and explicit.
  2. Capture the groups you want to preserve and then refer to them in the replacement (note that it should be a raw string to avoid escaping the backslash): re.sub('(a)b', r'\1d', 'abc')
  3. Similar to the previous option: provide a callback function as the repl argument and have it process the Match object and return the required result.
  4. Use lookbehinds/lookaheads, which are not included in the match but do affect matching: re.sub('(?<=a)b', r'd', 'abxb') yields adxb. The ?<= at the beginning of the group says "it's a lookbehind".
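
The four options above can be sketched side by side as follows. This is only an illustration: the helper name `keep_prefix` and the variable names are made up for this example, not part of the `re` module.

```python
import re

text = "abc"

# 1. Spell out the full pattern and the full replacement.
full = re.sub(r"ab", "ad", text)

# 2. Capture what should be kept and reference it as \1 in the replacement.
backref = re.sub(r"(a)b", r"\1d", text)

# 3. Pass a callback; it receives the Match object and returns the new text.
def keep_prefix(match):
    # hypothetical helper: keep group 1, replace the rest of the match with "d"
    return match.group(1) + "d"

callback = re.sub(r"(a)b", keep_prefix, text)

# 4. Lookbehind: "a" must precede "b", but "a" is not part of the match.
lookbehind = re.sub(r"(?<=a)b", "d", "abxb")

print(full, backref, callback, lookbehind)
```

If the replacement text right after the backreference starts with a digit, the \g<1> form avoids ambiguity, since r'\g<1>0' means group 1 followed by a literal 0, whereas r'\10' would be read as group 10.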
