Python regular expression to remove all square brackets and their contents

Problem description

I am trying to use this regular expression to remove all instances of square brackets (and everything in them) from strings. For example, this works when there is only one pair of square brackets in the string:

import re
pattern = r'\[[^()]*\]'
s = """Issachar is a rawboned[a] donkey lying down among the sheep pens."""
t = re.sub(pattern, '', s)
print(t)

What I get is correct:

>>>Issachar is a rawboned donkey lying down among the sheep pens.

However, if my string contains more than one set of square brackets, it doesn't work. For example:

s = """Issachar is a rawboned[a] donkey lying down among the sheep pens.[b]"""

I get:

>>>Issachar is a rawboned

I need the regular expression to work no matter how many pairs of square brackets are in the string. The correct output should be:

>>>Issachar is a rawboned donkey lying down among the sheep pens.

I have researched and tried many permutations to no avail.

Solution

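By default, * (and +) match greedily. Because the character class [^()] excludes only parentheses, it can also match [ and ], so the pattern given in the question matches everything from the first [ up to the last ].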

>>> re.findall(r'\[[^()]*\]', "Issachar is a rawboned[a] donkey lying down among the sheep pens.[b]")
['[a] donkey lying down among the sheep pens.[b]']

Appending ? to the repetition operator (*) makes it non-greedy, so each match stops at the nearest ].

>>> import re
>>> pattern = r'\[.*?\]'
>>> s = """Issachar is a rawboned[a] donkey lying down among the sheep pens.[b]"""
>>> re.sub(pattern, '', s)
'Issachar is a rawboned donkey lying down among the sheep pens.'
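
A possible alternative, not part of the original answer, is to keep the greedy * but exclude the closing bracket from the character class, which appears to be what the pattern in the question was aiming for. The sketch below assumes a hypothetical helper named strip_brackets. Neither this pattern nor the non-greedy one handles nested brackets.

```python
import re

# Hypothetical helper name, introduced only for illustration.
def strip_brackets(text):
    # \[      a literal opening bracket
    # [^\]]*  any run of characters that are not a closing bracket
    # \]      the literal closing bracket
    return re.sub(r'\[[^\]]*\]', '', text)

s = "Issachar is a rawboned[a] donkey lying down among the sheep pens.[b]"
print(strip_brackets(s))
# Issachar is a rawboned donkey lying down among the sheep pens.

# Unlike '.', the negated class also matches newlines, so a bracketed
# span that wraps across lines is removed without needing re.DOTALL.
multiline = "first[note\ncontinued] second"
print(strip_brackets(multiline))
# first second
```

With the non-greedy pattern r'\[.*?\]', the multi-line case would need the re.DOTALL flag, because . does not match a newline by default.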
