在运行时从语法构建语法分析器 [英] Build parser from grammar at runtime

查看:153
本文介绍了在运行时从语法构建语法分析器的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

C ++的许多(大多数)正则表达式库允许在运行时从字符串创建表达式。有人知道任何C ++解析器生成器,允许在运行时将一个表示为字符串的语法(最好是BNF)提供给一个生成器吗?所有的实现我发现要么需要一个明确的代码生成器来运行或需要通过巧妙的模板元程序表达的语法。

解决方案

它应该很容易构建一个递归下降,回溯解析器接受一个语法作为输入。您可以将所有规则减少为以下形式(或者就像您有的一样):

  A = B C D; 

通过递归下降解析这样的规则很容易:调用对应于找到B的例程,然后一个找到一个C,然后找到一个D.假定你正在做一个一般的解析器,你可以总是调用一个parse_next_sentential_form(x)函数,并传递所需的形式(终端或非终止令牌)的名称为x (例如B,C,D)。



在处理这样的规则时,解析器想要通过找到B,然后是C,然后是D.要找到B(或C或D),你想要一个索引的规则集,其中所有的左侧是相同的,所以一个人可以很容易地枚举B生成规则,并递归处理其内容。如果你的解析器失败,它只是回溯。



这不会是一个快速的解析器,但是如果良好实现不应该是可怕的。

也可以使用Earley解析器,通过创建部分处理规则的状态进行解析。



如果你想要快速,我想你可以简单地采取野牛的胆量,使它成为一个图书馆。然后,如果你有语法文本或语法规则(不同的入口点到Bison),你可以启动它,并在内存中生成它的表(它必须以某种形式)。不要吐出来;只需构建使用它们的LR解析引擎。 Voila,即时有效解析器生成。
如果你这样做,你必须担心语法的模糊性和LALR(1)。前两个解决方案使用任何上下文无关语法。


Many (most) regular expression libraries for C++ allow for creating the expression from a string during runtime. Is anyone aware of any C++ parser generators that allow for feeding a grammar (preferably BNF) represented as a string into a generator at runtime? All the implementations I've found either require an explicit code generator to be run or require the grammar to be expressed via clever template meta-programming.

解决方案

It should be pretty easy to build a recursive descent, backtracking parser that accepts a grammar as input. You can reduce all your rules to the following form (or act as if you have):

 A = B C D ;

Parsing such a rule by recursive descent is easy: call a routine that corresponds to finding a B, then one that finds a C, then one that finds a D. Given you are doing a general parser, you can always call a "parse_next_sentential_form(x)" function, and pass the name of the desired form (terminal or nonterminal token) as x (e.g., "B", "C", "D").

In processing such a rule, the parser wants to produce an A, by finding a B, then C, then D. To find B (or C or D), you'd like to have an indexed set of rules in which all the left-hand sides are the same, so one can easily enumerate the B-producing rules, and recurse to process their content. If your parser gets a failure, it simply backtracks.

This won't be a lightning fast parser, but shouldn't be terrible if well implemented.

One could also use an Earley parser, which parses by creating states of partially-processed rules.

If you wanted it to be fast, I suppose you could simply take the guts of Bison and make it into a library. Then if you have grammar text or grammar rules (different entry points into Bison), you could start it and have it produce its tables in memory (which it must do in some form). Don't spit them out; simply build an LR parsing engine that uses them. Voila, on-the-fly efficient parser generation. You have to worry about ambiguities and the LALR(1)ness of your grammar if you do this; the previous two solutions work with any context free grammar.

这篇关于在运行时从语法构建语法分析器的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆