Regular expression matching A, B, and AB

Problem description

I would like to create a regular expression that matches A, B, and AB, where A and B are quite complex regular expressions.

One solution is to use (A|A?B) or (AB?|B), but then I have to repeat one of the expressions.

A?B? does not work, since this also matches the empty string.
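The difference between the two approaches can be checked with a small Python sketch. The sub-patterns `\d+` and `[a-z]+` are hypothetical stand-ins for A and B, chosen only for illustration; they are not from the original question.

```python
import re

# Hypothetical stand-ins for the "quite complex" expressions A and B.
A = r"\d+"
B = r"[a-z]+"

# A?B? : both parts optional, so the empty string is accepted too.
both_optional = re.compile(rf"(?:{A})?(?:{B})?")
# A|A?B : accepts A, B and AB, but A appears twice in the pattern.
alternation = re.compile(rf"(?:{A})|(?:{A})?(?:{B})")

samples = ["12", "ab", "12ab", ""]
optional_hits = [s for s in samples if both_optional.fullmatch(s)]
alternation_hits = [s for s in samples if alternation.fullmatch(s)]

print(optional_hits)     # ['12', 'ab', '12ab', '']
print(alternation_hits)  # ['12', 'ab', '12ab']
```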

Is it possible to create this regular expression without repeating either A or B?

Solution

In general, it is not possible. You may use some workarounds though.

If A and B start and end with word characters

If A and B consist of, or start and end with, word characters (letters, digits or _), you may use

(?<!\w)A?(?:B)?(?!\w)(?<!\W(?!\w))(?<!^(?!\w))

See the regex demo

  • (?<!\w) - no word character allowed before
  • A? - an optional A
  • (?:B)? - an optional B
  • (?!\w) - no word char is allowed right after (at this point, the pattern can still match an empty string between the start of the string and a non-word char, between a non-word char and the end of the string, or between two non-word chars, hence the next two checks)
  • (?<!\W(?!\w)) - no match is allowed if the char right before is a non-word char that is not followed by a word char (this cancels empty matches between two non-word chars and between a non-word char and the end of the string)
  • (?<!^(?!\w)) - no match is allowed at the start of the string if it is not followed by a word char.
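The lookaround pattern above can be tried out with a short Python sketch. It assumes `foo` and `bar` as hypothetical stand-ins for A and B (both start and end with word characters, as this workaround requires) and uses Python's `re` module, which accepts these lookbehinds because each has a fixed width.

```python
import re

# Hypothetical stand-ins for A and B; both start and end with word characters.
A = r"foo"
B = r"bar"

pattern = re.compile(
    rf"(?<!\w)(?:{A})?(?:{B})?(?!\w)"  # optional A, optional B, bounded by non-word context
    r"(?<!\W(?!\w))"                   # reject empty matches right after a non-word char
    r"(?<!^(?!\w))"                    # reject an empty match at the start of the string
)

text = "foo bar foobar, baz"
matches = [m.group() for m in pattern.finditer(text)]
print(matches)  # ['foo', 'bar', 'foobar']

# The empty string and a run of non-word characters yield no matches at all.
empty_matches = [m.group() for m in pattern.finditer("")]
punct_matches = [m.group() for m in pattern.finditer("!!")]
```

Each of A and B appears only once in the pattern, and the two trailing lookbehinds are what prevent the zero-length matches that a plain `A?B?` would produce.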

Avoid repeating part of the expression in an alternation-based pattern

In PCRE, you can avoid writing the same pattern part twice by reusing a subpattern through a subroutine call:

A(?<BGroup>B)?|(?&BGroup)

See the regex demo.

The (?<BGroup>B) is a named capturing group whose pattern is repeated with the (?&BGroup) named subroutine call.
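Python's built-in `re` module does not support `(?&name)` subroutine calls, so the pattern above will not compile there. As a rough substitute, the repetition can at least be kept out of the source by building the pattern from variables. The helper name `either_or_both` and the sample sub-patterns below are hypothetical; this is a sketch of that idea, not an equivalent of the PCRE feature.

```python
import re

def either_or_both(a: str, b: str) -> re.Pattern:
    """Build A(?:B)?|B from two sub-patterns written once each in the source."""
    # Mirrors the shape of A(?<BGroup>B)?|(?&BGroup), with b interpolated twice
    # instead of being re-invoked through a subroutine call.
    return re.compile(rf"(?:{a})(?:{b})?|(?:{b})")

# Hypothetical stand-ins for A and B.
pattern = either_or_both(r"\d+", r"[a-z]+")

accepted = [s for s in ["12", "ab", "12ab"] if pattern.fullmatch(s)]
rejected = [s for s in ["", "ab12"] if not pattern.fullmatch(s)]
print(accepted)  # ['12', 'ab', '12ab']
print(rejected)  # ['', 'ab12']
```

The compiled pattern still contains B twice; only the source code avoids the duplication.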

See Recursive patterns.
