Regexp to remove specific number of occurrences of character only


Question

Using Python's re module, I have long strings of text containing runs of > characters of different lengths. A single string can have 3 consecutive > characters in the middle, >> at the beginning, or any such combination.

I want to write a regexp that, after the string has been split on spaces, is applied to each word and matches only those regions that consist of exactly 2 consecutive > characters (>>). I can't be sure whether the >> is at the beginning, middle, or end of the word, what characters come before or after it, or whether it makes up the entire word.

So far I have come up with:

word = re.sub(r'>{2}', '', word)

This also removes pairs of > from runs of 3 or more, so longer runs get shortened instead of being left alone. What regular expression would work for this requirement? Any help is appreciated.

Recommended answer

You need to make sure the character does not also appear immediately to the left or right of the run. Use a pair of lookarounds for this: a negative lookbehind and a negative lookahead. The general scheme is

(?<!X)X{n}(?!X)

where (?<!X) means no X is allowed immediately on the left, X{n} means n occurrences of X, and (?!X) means no X is allowed immediately on the right.
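The general scheme can be sketched as a small Python helper. The function name exact_run_pattern and its parameters are made up for illustration and are not part of the original answer; the sketch assumes char is a single character.

```python
import re

def exact_run_pattern(char, n):
    """Compile a pattern that matches a run of exactly n copies of char.

    Hypothetical helper: assumes `char` is a single character.
    """
    x = re.escape(char)  # escape in case char is a regex metacharacter
    # (?<!X)  no X immediately on the left
    # X{n}    exactly n occurrences of X
    # (?!X)   no X immediately on the right
    return re.compile(f"(?<!{x}){x}{{{n}}}(?!{x})")

# Usage: remove runs of exactly two asterisks, leave other run lengths alone.
pattern = exact_run_pattern("*", 2)
print(pattern.sub("", "a**b***c*d"))  # prints: ab***c*d
```

Escaping the character with re.escape keeps the helper usable for characters such as * or . that have a special meaning in a regex.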

In this case, use

r'(?<!>)>{2}(?!>)'

See the regex demo.
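As a minimal sketch of how the pattern could be applied to the workflow described in the question (split on whitespace, then process each word): the sample text and the names EXACTLY_TWO and strip_double_gt are invented for illustration.

```python
import re

# Pattern from the answer: exactly two '>' with no '>' on either side.
EXACTLY_TWO = re.compile(r"(?<!>)>{2}(?!>)")

def strip_double_gt(word):
    """Remove every run of exactly two '>' from word (hypothetical helper)."""
    return EXACTLY_TWO.sub("", word)

# Made-up sample input covering start, middle, end, and longer runs.
text = ">> start a>>b >>> middle>>>> >>"
words = [strip_double_gt(word) for word in text.split()]
print(words)  # prints: ['', 'start', 'ab', '>>>', 'middle>>>>', '']
```

Runs of 3 or 4 > characters are left untouched, while a word that is only >> becomes an empty string, so the caller may want to filter out empty results afterwards.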
