Is it possible to call one yacc parser from another to parse a specific token substream?

Problem description

Suppose I already have a complete YACC grammar, say a C grammar. Now I want to create a separate parser for a domain-specific language with a simple grammar, except that it still needs to parse complete C type declarations. I'd rather not duplicate the long rules from the original grammar along with their associated handling code; instead I'd like to call out to the original parser to handle exactly one rule (let's call it "declarator").

If it were a recursive descent parser, there would be a function for each rule, which would be easy to call. But what about YACC, with its implicit stack automaton?

Solution

Basically, no. Composing LR grammars is not easy, and bison doesn't offer much help.

But all is not lost. Nothing stops you from including the entire grammar (except the %start declaration) and just using part of it, apart from one little detail: bison will warn about useless nonterminals and rules, that is, the ones that can no longer be reached from the new start symbol.

If that's a show-stopper for you, then you can use a trick to make it possible to create a grammar with multiple start rules. In fact, you can create a grammar which lets you specify the start symbol every time you call the parser; it doesn't even have to be baked in. Then you can tuck that into a library and use whichever parser you want.

Of course, this comes at a cost: the parser is bigger than it would otherwise need to be. However, it shouldn't be any slower, or at least not much (there might be some cache effects), and the extra size is probably insignificant compared to the rest of your compiler.

The hack is described in the bison FAQ in quite a lot of detail, so I'll just outline it here: for each start production you want to support, you create one extra production which starts with a pseudo-token (that is, a token code which the lexer never produces from actual input). For example, you might do the following:

%start meta_start
%token START_C START_DSL

%%

meta_start: START_C c_start
          | START_DSL dsl_start
          ;

Now you just have to arrange for the lexer to produce the appropriate START token as the first token of each parse. There are various ways to do that; the FAQ suggests using a global variable, but if you use a re-entrant flex scanner, you can just put the desired start token in the scanner state (along with a flag which is set once the start token has been sent).
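To make the scanner-state idea concrete, here is a minimal sketch in plain C rather than actual flex output. Everything in it is an assumption for illustration: the token values, the `scan_state` struct, and the `real_lex` stub that stands in for the generated scanner. In a real re-entrant flex scanner the same two fields could live in the scanner's extra data (`yyextra`).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token codes; in a real build they would come from the
   header that bison generates. */
enum token { TOK_EOF = 0, START_C = 258, START_DSL = 259, IDENT = 260 };

/* Hypothetical per-scan state: the pseudo-token to emit first, a flag
   recording that it has been emitted, and a stub token stream. */
struct scan_state {
    int start_token;   /* START_C or START_DSL, chosen by the caller */
    int start_sent;    /* nonzero once start_token has been returned */
    const int *input;  /* pre-lexed tokens, terminated by TOK_EOF */
};

/* Stand-in for the real scanner: hands out the pre-lexed tokens and
   keeps returning TOK_EOF once the stream is exhausted. */
int real_lex(struct scan_state *s)
{
    int t = *s->input;
    if (t != TOK_EOF)
        s->input++;
    return t;
}

/* The function the parser would call for each token. The first call
   returns the pseudo-token; later calls defer to the real scanner. */
int next_token(struct scan_state *s)
{
    assert(s != NULL);
    if (!s->start_sent) {
        s->start_sent = 1;
        return s->start_token;
    }
    return real_lex(s);
}
```

A caller would fill in a `scan_state` with `start_token` set to `START_DSL` (or `START_C`) and `start_sent` set to 0 before each parse, so the same grammar can be entered at either rule without rebuilding the parser.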
