Regex in Python: is it possible to get the match, replacement, and final string?
Question
To do a regex substitution, you give it three things:
- the match pattern
- the replacement pattern
- the original string
The regex engine, in turn, finds three things that are of interest to me:
- the matched string
- the replacement string
- the final processed string
When using re.sub, the final string is what's returned. But is it possible to access the other two things, the matched string and the replacement string?
Here is an example:
import re

orig = "This is the original string."
matchpat = "(orig.*?l)"
replacepat = "not the \\1"
final = re.sub(matchpat, replacepat, orig)
print(final)
# This is the not the original string.
The match string is "original" and the replacement string is "not the original". Is there a way to get them? I'm writing a script to search and replace in many files, and I want it to print what it's finding and replacing, without printing out the entire line.
Recommended answer
class Replacement(object):
    def __init__(self, replacement):
        self.replacement = replacement
        self.matched = None
        self.replaced = None

    def __call__(self, match):
        # re.sub calls this with the match object for each match
        self.matched = match.group(0)
        self.replaced = match.expand(self.replacement)
        return self.replaced
>>> repl = Replacement('not the \\1')
>>> re.sub('(orig.*?l)', repl, 'This is the original string.')
'This is the not the original string.'
>>> repl.matched
'original'
>>> repl.replaced
'not the original'
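This works because re.sub accepts a callable as its replacement argument and calls it with each match object, while match.expand applies a replacement template such as \1 to that match. The same idea can be sketched without a class, using a closure. The helper name sub_and_report is hypothetical and not part of the answer; like the class above, it keeps only the last match.

```python
import re

def sub_and_report(pattern, template, text):
    """Hypothetical helper: return (final, matched, replaced) for the last match."""
    seen = {"matched": None, "replaced": None}

    def on_match(match):
        # Called by re.sub once per match; record what was found and
        # what it expands to, then hand the expansion back to re.sub.
        seen["matched"] = match.group(0)
        seen["replaced"] = match.expand(template)
        return seen["replaced"]

    final = re.sub(pattern, on_match, text)
    return final, seen["matched"], seen["replaced"]

final, matched, replaced = sub_and_report(
    r"(orig.*?l)", r"not the \1", "This is the original string."
)
print(matched, "->", replaced)
```

If nothing matches, matched and replaced stay None and final is the unchanged input.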
As @F.J has pointed out, the above will remember only the last match/replacement. This version handles multiple occurrences:
class Replacement(object):
    def __init__(self, replacement):
        self.replacement = replacement
        self.occurrences = []

    def __call__(self, match):
        matched = match.group(0)
        replaced = match.expand(self.replacement)
        # Keep every (matched, replaced) pair, in match order
        self.occurrences.append((matched, replaced))
        return replaced
>>> repl = Replacement('[\\1]')
>>> re.sub(r'\s(\d)', repl, '1 2 3')
'1[2][3]'
>>> for matched, replaced in repl.occurrences:
...     print(matched, '=>', replaced)
...
 2 => [2]
 3 => [3]
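For the search-and-replace script described in the question, the same recording callback could be applied line by line so that only the matches and their replacements are printed, not the whole line. This is a minimal sketch under that assumption; the generator name replace_lines and its return shape are hypothetical, and reading or writing the actual files is left out.

```python
import re

def replace_lines(lines, pattern, template):
    """Hypothetical helper: yield (new_line, [(matched, replaced), ...]) per line."""
    regex = re.compile(pattern)
    for line in lines:
        occurrences = []

        def record(match):
            # Expand the template for this match and remember the pair.
            replaced = match.expand(template)
            occurrences.append((match.group(0), replaced))
            return replaced

        new_line = regex.sub(record, line)
        yield new_line, occurrences

lines = ["1 2 3", "no digits after space"]
results = list(replace_lines(lines, r"\s(\d)", r"[\1]"))
for new_line, occurrences in results:
    for matched, replaced in occurrences:
        print(repr(matched), "=>", repr(replaced))
```

Printing with repr makes the leading whitespace in each match visible, which is easy to miss otherwise.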