为什么`(c*)|(cccd)` 匹配`ccc`,而不是`cccd`? [英] Why does `(c*)|(cccd)` match `ccc`, not `cccd`?

查看:30
本文介绍了为什么`(c*)|(cccd)` 匹配`ccc`,而不是`cccd`?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我以为我很了解正则表达式,但为什么匹配的是ccc",而不是cccd"?

<预><代码>>>>mo = re.match('(c*)|(cccd)', 'cccd')>>>mo.group(0)'抄送'

这个特殊情况使用 Python 的 re 模块.

解决方案

Regex 模式从左到右计算.将优先级高的模式放在第一位(|的左边),将优先级较低的放在第二个(|的右边)代码>).请注意,不允许第二个模式匹配已经与第一个模式匹配的文本.也就是说,默认情况下正则表达式引擎不会进行重叠匹配.要使正则表达式引擎进行重叠匹配,您需要将您的模式放入一个捕获组中,并再次将捕获组放入一个正向环视断言中(正向前瞻和正向后视).

mo = re.match('(cccd)|(c*)', 'cccd')

I thought I understood Regular Expressions pretty well, but why is this matching 'ccc', not 'cccd'?

>>> mo = re.match('(c*)|(cccd)', 'cccd')
>>> mo.group(0)
'ccc'

This particular case is using Python's re module.

解决方案

Regex patterns are evaluated from left to right. Put the pattern which has higher precedence as first (to the left of |) and the lower precedence as second (to the right of |). Note that the second pattern was not allowed to match the text which was already matched by the first pattern. That is, regex engine by default won't do overlapping matches. To make the regex engine to do overlapping match then you need to put your pattern inside a capturing group and again put the capturing group inside a positive lookaround assertion (positive lookahead and positive lookbehind).

mo = re.match('(cccd)|(c*)', 'cccd')

这篇关于为什么`(c*)|(cccd)` 匹配`ccc`,而不是`cccd`?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆