Repeating multiple characters regex

Question

Is there a way using a regex to match a repeating set of characters? For example:

ABCABCABCABCABC

ABC{5}

I know that's wrong. But is there anything to match that effect?

Update:

Can you use nested capture groups? Something like (?<cap>(ABC){5})?

Solution

Enclose the regex you want to repeat in parentheses. For instance, if you want 5 repetitions of ABC:

(ABC){5}

Or if you want any number of repetitions (0 or more):

(ABC)*

Or one or more repetitions:

(ABC)+
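
To make the difference concrete, here is a minimal Python sketch (the variable names are only illustrative) that checks the question's ABC{5} attempt against the grouped forms above. Without parentheses, {5} binds only to the preceding C, so ABC{5} means "AB followed by five Cs".

```python
import re

text = "ABCABCABCABCABC"

# Without a group, {5} applies only to "C": "AB" followed by five "C"s.
wrong = re.compile(r"ABC{5}")
# With a group, the quantifier applies to the whole sequence "ABC".
exactly_five = re.compile(r"(ABC){5}")
zero_or_more = re.compile(r"(ABC)*")
one_or_more = re.compile(r"(ABC)+")

print(wrong.fullmatch(text))                      # None
print(wrong.fullmatch("ABCCCCC") is not None)     # True
print(exactly_five.fullmatch(text) is not None)   # True
print(zero_or_more.fullmatch("") is not None)     # True
print(one_or_more.fullmatch("") is not None)      # False
```

fullmatch is used here so that each pattern has to account for the entire string; with search, (ABC)* would also succeed on any input by matching the empty string.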

Edit, in response to the update:

Parentheses in regular expressions do two things. First, they group a sequence of items together, so that an operator applies to the entire sequence instead of just the last item. Second, they capture the contents of that group, so that you can extract the substring matched by that subexpression.
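
One consequence of combining both roles is worth seeing in code. In Python's re module, when a capturing group is repeated, the group keeps only the text from its last repetition, while group 0 still holds the whole match. A small sketch, with an input string invented for illustration:

```python
import re

# The group (ABC) is repeated five times, but it is still a single group.
m = re.search(r"(ABC){5}", "xxABCABCABCABCABCyy")

whole = m.group(0)  # the entire match: 'ABCABCABCABCABC'
last = m.group(1)   # only the final repetition: 'ABC'
span = m.span(1)    # where that final repetition sits: (14, 17)

print(whole, last, span)
```

So if the goal is to extract all five repetitions as one substring, the repeated group needs to be wrapped in an outer capturing group, or you can simply read group 0.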

You can nest parentheses; groups are numbered by the position of their opening parenthesis, counting from the left, and group 0 is the entire match. For instance:

>>> import re
>>> re.search('[0-9]* (ABC(...))', '123 ABCDEF 456').group(0)
'123 ABCDEF'
>>> re.search('[0-9]* (ABC(...))', '123 ABCDEF 456').group(1)
'ABCDEF'
>>> re.search('[0-9]* (ABC(...))', '123 ABCDEF 456').group(2)
'DEF'

If you want to group without capturing, use (?: ... ) instead of plain parentheses. This is helpful when the parentheses exist only to apply an operator to a sequence and you don't want them to change the numbering of your captured groups. It can also be slightly faster, since the engine does not have to record what the group matched.

>>> re.search('[0-9]* (?:ABC(...))', '123 ABCDEF 456').group(1)
'DEF'
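
The effect on numbering is easiest to see by comparing groups() for the two patterns side by side. A short sketch reusing the same input as above:

```python
import re

text = "123 ABCDEF 456"

capturing = re.search(r"[0-9]* (ABC(...))", text)
non_capturing = re.search(r"[0-9]* (?:ABC(...))", text)

# Two capturing groups: the outer (ABC(...)) and the inner (...).
print(capturing.groups())      # ('ABCDEF', 'DEF')
# The outer group no longer captures, so the inner group becomes group 1.
print(non_capturing.groups())  # ('DEF',)
# The compiled pattern reports how many capturing groups it contains.
print(capturing.re.groups, non_capturing.re.groups)  # 2 1
```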

So to answer your update, yes, you can use nested capture groups, or even avoid capturing with the inner group at all:

>>> re.search('((?:ABC){5})(DEF)', 'ABCABCABCABCABCDEF').group(1)
'ABCABCABCABCABC'
>>> re.search('((?:ABC){5})(DEF)', 'ABCABCABCABCABCDEF').group(2)
'DEF'
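
The update in the question uses a named group written as (?<cap>...). That spelling is used by several other regex flavors; Python's re module, which the examples here use, writes named groups as (?P<cap>...) and, as far as I know, does not accept the shorter form. A sketch of the same pattern with the outer group named, keeping the name cap from the question:

```python
import re

# Outer group is named "cap"; the inner repeated group is non-capturing.
pattern = re.compile(r"(?P<cap>(?:ABC){5})(DEF)")
m = pattern.search("ABCABCABCABCABCDEF")

print(m.group("cap"))  # ABCABCABCABCABC
print(m.group(1))      # ABCABCABCABCABC (named groups are also numbered)
print(m.group(2))      # DEF
print(m.groupdict())   # {'cap': 'ABCABCABCABCABC'}
```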
