使用正则表达式(PCRE)匹配a ^ n b ^ n c ^ n(例如"aaabbbccc") [英] Match a^n b^n c^n (e.g. "aaabbbccc") using regular expressions (PCRE)

查看:158
本文介绍了使用正则表达式(PCRE)匹配a ^ n b ^ n c ^ n(例如"aaabbbccc")的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

众所周知,现代正则表达式实现(最著名的是PCRE)与无上下文语法 {a n b n ; n> 0}(例如aaabbb)使用此正则表达式(演示):

It is a well known fact that modern regular expression implementations (most notably PCRE) have little in common with the original notion of regular grammars. For example you can parse the classical example of a context-free grammar {anbn; n>0} (e.g. aaabbb) using this regex (demo):

~^(a(?1)?b)$~

我的问题是:你能走多远?是否还可以解析上下文相关语法 {a n b n c n ; n> 0}(例如,aaabbbccc)使用PCRE吗?

My question is: How far can you go? Is it also possible to parse the context-sensitive grammar {anbncn;n>0} (e.g. aaabbbccc) using PCRE?

推荐答案

受到NullUserExceptions答案的启发(他已经删除了,因为它在一种情况下失败了),我想我自己也找到了解决方案:

Inspired by NullUserExceptions answer (which he already deleted as it failed for one case) I think I have found a solution myself:

$regex = '~^
    (?=(a(?-1)?b)c)
     a+(b(?-1)?c)
$~x';

var_dump(preg_match($regex, 'aabbcc'));    // 1
var_dump(preg_match($regex, 'aaabbbccc')); // 1
var_dump(preg_match($regex, 'aaabbbcc'));  // 0
var_dump(preg_match($regex, 'aaaccc'));    // 0
var_dump(preg_match($regex, 'aabcc'));     // 0
var_dump(preg_match($regex, 'abbcc'));     // 0

亲自尝试: http://codepad.viper-7.com/1erq9v

如果考虑不带正向超前断言((?=...)部分)的正则表达式,则具有以下条件:

If you consider the regex without the positive lookahead assertion (the (?=...) part), you have this:

~^a+(b(?-1)?c)$~

这无非是检查是否有任意数量的a,然后是相等数量的b s和c s.

This does nothing more than check that there's an arbitrary number of as, followed by an equal number of bs and cs.

这还不能满足我们的语法要求,因为a的数量也必须相同.我们可以通过检查a s的数量等于b s的数量来确保.这就是前瞻断言中的表达式的作用:(a(?-1)?b)c. c是必需的,因此我们不仅要匹配b的一部分.

This doesn't yet satisfy our grammar, because the number of as must be the same, too. We can ensure that by checking that the number of as equals the number of bs. And this is what the expression in the lookahead assertion does: (a(?-1)?b)c. The c is necessary so we don't only match a part of the bs.

我认为这令人印象深刻,这表明现代正则表达式不仅能够解析非常规语法,而且甚至可以解析非上下文无关的语法.希望这将消除您不能用正则表达式来执行X,因为X不规则"的无休止的模仿.

I think this impressively shows that modern regex is not only capable of parsing non-regular grammars, but can even parse non-context-free grammars. Hopefully this will lay to rest the endless parroting of "you can't do X with regex because X isn't regular"

这篇关于使用正则表达式(PCRE)匹配a ^ n b ^ n c ^ n(例如"aaabbbccc")的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆