Building a Regex Based Parser
Question
Is it stupid to build a regex based parser?
Recommended answer
Matching nested parens is exceedingly simple using modern patterns. Not counting whitespace, this sort of thing:
\( (?: [^()] *+ | (?0) )* \)
works for mainstream languages like Perl and PHP, plus anything that uses PCRE.
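Python's standard `re` module does not support the `(?0)` recursion used above, so the following is only a rough sketch of the same grammar written as a recursive function. It swaps regex recursion for ordinary function recursion; the name `match_balanced` is hypothetical and chosen for illustration.

```python
import re

# Runs of characters that are not parentheses, mirroring [^()]*+ in the pattern.
NON_PAREN = re.compile(r"[^()]+")


def match_balanced(text, pos=0):
    """Return the end index of a balanced group starting at pos, or None."""
    if pos >= len(text) or text[pos] != "(":
        return None
    pos += 1  # consume the opening "("
    while pos < len(text):
        if text[pos] == ")":
            return pos + 1  # consume the closing ")" and finish this group
        if text[pos] == "(":
            # Recurse on the nested group; this plays the role of (?0).
            end = match_balanced(text, pos)
            if end is None:
                return None
            pos = end
        else:
            # Skip a run of non-paren characters in one step.
            pos = NON_PAREN.match(text, pos).end()
    return None  # input ended before the closing ")"
```

Calling `match_balanced("(a(b)c)")` returns the index just past the final `)`, while an unbalanced input such as `"(a(b)c"` returns `None`. In Perl, PHP, or any PCRE-based engine the single recursive pattern does this work directly.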
However, you really need grammatical regexes for a full parse, or you’ll go nuts. Don’t use a language whose regexes don’t support breaking regexes down into smaller units, or which don’t support proper debugging of their compilation and execution. Life’s too short for low-level hackery. Might as well go back to assembly language if you’re going to do that.
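As a small illustration of breaking a regex into smaller units, here is a minimal lexer sketch in Python. It is not a grammatical regex in the Perl sense; it only shows the idea of naming each piece and assembling them. The token names and the `tokenize` helper are assumptions made up for this example.

```python
import re

# Each token class is a small, separately named pattern that can be tested alone.
TOKEN_PARTS = {
    "NUMBER": r"\d+(?:\.\d+)?",
    "IDENT": r"[A-Za-z_]\w*",
    "OP": r"[-+*/=()]",
    "SKIP": r"\s+",
}

# Join the units into one master pattern with a named group per token class.
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_PARTS.items())
)


def tokenize(text):
    """Return (kind, lexeme) pairs; raise on a character no unit matches."""
    pos = 0
    tokens = []
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "SKIP":  # whitespace is matched but not emitted
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

Because every unit has a name, a failing token can be debugged by testing its pattern on its own, which is much easier than staring at one monolithic expression.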
I’ve written about recursive patterns, grammatical patterns, and parsing quite a bit: for example, see here for parsing approaches and here for lexer approaches; also, the final solution here.
Also, Perl’s Regexp::Grammars module is especially useful in turning grammatical regexes into parsing structures.
So by all means, go for it. You’ll learn a lot that way.