Using Python to substitute or swap substrings in a file


Problem description


Suppose I have a line in an ASCII file of the following form:

{text1} {stringA} {text2} {stringB} {text3}

where {stringA} and {stringB} are substrings of interest. Let's call them "A" and "B" respectively. The strings {text1}, {text2}, and {text3} are strings of any length (possibly empty) that do not contain either A or B.

What I want to do in Python is simply swap A and B such that the line goes from

{text1} {stringA} {text2} {stringB} {text3}

to

{text1} {stringB} {text2} {stringA} {text3}

I'd appreciate any help here. I think that getting help with this question will also help me learn to work better with regular expressions in Python.

Note that {text1}, {text2}, and {text3} are unknown strings.

We know exactly the substrings A and B. We know that A precedes B in the line. However, we don't know what (if anything) is before/between/after them.

Examples (A=John, B=Tim):

(1) This:

"I told John to give the bag to Tim."

is changed to this:

"I told Tim to give the bag to John."

(2) This:

"John said hello to Tim."

is changed to this:

"Tim said hello to John."

(3) This:

"John!h9aghagTim"

is changed to this:

"Tim!h9aghagJohn"

Solution

>>> import re
>>> text = '{text1} {stringA} {text2} {stringB} {text3}'
>>> re.sub(r'(stringA)(.*)(stringB)', r'\3\2\1', text)
'{text1} {stringB} {text2} {stringA} {text3}'

Replace stringA and stringB with your substrings of interest. Note that you may want to pass them through re.escape() first, in case the substrings contain characters that have a special meaning in a regex (such as ".", "+" or "$").
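As a minimal sketch of the re.escape() advice, the swap can be wrapped in a small helper. The function name swap_substrings and the sample strings are hypothetical, made up for illustration; the pattern and replacement are the same as in the answer, with both substrings escaped and count=1 so only one A ... B pair per line is swapped.

```python
import re

def swap_substrings(line, a, b):
    """Swap literal substrings a and b in line, assuming a occurs before b.

    Hypothetical helper: both substrings are escaped so that characters
    such as '.', '+' or '$' are matched literally, not as regex syntax.
    """
    pattern = re.compile(r'(%s)(.*)(%s)' % (re.escape(a), re.escape(b)))
    # \3\2\1 puts b first, keeps the text in between, then puts a last.
    return pattern.sub(r'\3\2\1', line, count=1)

swapped = swap_substrings("cost: $5 (approx) vs 1+1 total", "$5", "1+1")
unchanged = swap_substrings("nothing to swap here", "$5", "1+1")
```

If the line does not contain A followed by B, the pattern does not match and the line comes back unchanged.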

Test cases:

>>> stringA = 'John'
>>> stringB = 'Tim'
>>> regex = re.compile(r'(%s)(.*)(%s)' % (stringA, stringB))
>>> regex.sub(r'\3\2\1', "I told John to give the bag to Tim.")
'I told Tim to give the bag to John.'
>>> regex.sub(r'\3\2\1', "John said hello to Tim.")
'Tim said hello to John.'
>>> regex.sub(r'\3\2\1', "John!h9aghagTim")
'Tim!h9aghagJohn'
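Since the question is about lines in a file, here is a rough sketch of applying the same substitution line by line. The function name swap_in_stream is hypothetical, and io.StringIO objects stand in for real files opened with open(); this is an assumption for the sake of a self-contained example, not part of the original answer.

```python
import io
import re

def swap_in_stream(src, dst, a, b):
    """Copy src to dst, swapping the a ... b pair on each line (hypothetical helper)."""
    pattern = re.compile(r'(%s)(.*)(%s)' % (re.escape(a), re.escape(b)))
    for line in src:
        # '.' does not match the newline, so the line ending is left untouched.
        dst.write(pattern.sub(r'\3\2\1', line, count=1))

# In-memory stand-ins for an input file and an output file.
src = io.StringIO(
    "I told John to give the bag to Tim.\n"
    "No names here.\n"
    "John!h9aghagTim\n"
)
dst = io.StringIO()
swap_in_stream(src, dst, "John", "Tim")
result = dst.getvalue()
```

With real files, src and dst would be two different file objects (for example, one opened for reading and one for writing), so the input is never overwritten while it is still being read.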
