Regexp: parsing type-3 grammars


Question

Reading about the Chomsky hierarchy ... I know that regexps can't parse type-2 grammars (context-free grammars), nor type-1 or type-0 grammars. Can regular expressions parse/match ALL type-3 grammars (regular grammars)?

Recommended answer

Yes, provided they support alternation, concatenation, and the Kleene star. This is the case for regexes of the PCRE (Perl/Java/JavaScript/PHP/...) type: alternation is implemented by ((...)|(...)), concatenation by (...)(...), and the Kleene star by (...)*. (There are a few other details — in most of these languages you need to use something like \A and \z to indicate "start-of-string" and "end-of-string", which in a regular grammar is taken for granted — but that's the idea.)
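To make the correspondence concrete, here is a minimal sketch in Python, whose `re` module offers the same three operations. The grammar, the names `REGEX`, `TRANSITIONS`, `regex_accepts`, `grammar_accepts` and `agree_up_to` are all made up for illustration and are not part of the original answer. `re.fullmatch` is used in place of explicit start/end anchors.

```python
import re
from itertools import product

# Hypothetical right-linear (type-3) grammar, chosen only for illustration:
#   S -> a B | c S | ε
#   B -> b S
# It generates the same language as the regular expression (ab|c)*.

# Alternation: (...)|(...), concatenation: (a)(b) written as ab, Kleene star: (...)*
REGEX = re.compile(r"((ab)|(c))*")


def regex_accepts(s: str) -> bool:
    # fullmatch requires the whole string to match, so no explicit anchors are needed.
    return REGEX.fullmatch(s) is not None


# The grammar read as a finite automaton: nonterminals are states, S is accepting.
TRANSITIONS = {("S", "a"): "B", ("S", "c"): "S", ("B", "b"): "S"}


def grammar_accepts(s: str) -> bool:
    state = "S"
    for ch in s:
        state = TRANSITIONS.get((state, ch))
        if state is None:  # no production applies: the string is not derivable
            return False
    return state == "S"  # only S has an ε-production


def agree_up_to(max_len: int) -> bool:
    # Compare both recognizers on every string over {a, b, c} up to max_len.
    for n in range(max_len + 1):
        for chars in product("abc", repeat=n):
            s = "".join(chars)
            if regex_accepts(s) != grammar_accepts(s):
                return False
    return True
```

Calling `agree_up_to(6)` checks every string of length at most 6 and is only a sanity check, not a proof that the two describe the same language.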

But not everything called a "regular expression" in a programming context necessarily has all of the above. For example, POSIX Basic Regular Expressions support only a very limited form of alternation, where every "branch" of the alternation consists of a single character: whereas PCREs have both (a|b|c) and the special-case equivalent [abc], POSIX BREs have only [abc], so they can't express something like (ab|c).
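The difference between a character class and general alternation can be seen in a small sketch. Note the assumption: Python's `re` module is not a POSIX BRE engine, so this only illustrates why [abc] cannot stand in for (ab|c); it does not exercise a BRE implementation. The helper `same_language_on` and the list `CANDIDATES` are hypothetical names introduced here.

```python
import re

# A handful of sample strings; this is a spot check, not an exhaustive comparison.
CANDIDATES = ["", "a", "b", "c", "d", "ab", "abc"]


def same_language_on(pattern_a: str, pattern_b: str, strings) -> bool:
    # True if both patterns accept exactly the same strings from the sample.
    return all(
        (re.fullmatch(pattern_a, s) is None) == (re.fullmatch(pattern_b, s) is None)
        for s in strings
    )


# Single-character branches: alternation and the character class agree.
single_char_equivalent = same_language_on(r"(a|b|c)", r"[abc]", CANDIDATES)

# A multi-character branch: (ab|c) accepts "ab", which [abc] cannot,
# because a character class always matches exactly one character.
alternation_matches_ab = re.fullmatch(r"(ab|c)", "ab") is not None
char_class_matches_ab = re.fullmatch(r"[abc]", "ab") is not None
```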
