C变体语法上的Shift/Reduce冲突 [英] Shift/Reduce conflict on C variation grammar

查看:85
本文介绍了C变体语法上的Shift/Reduce冲突的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在为类似C的语法编写解析器,但是遇到了移位/减少冲突的问题:

I am writing a parser to a C-like grammar, but I am having a problem with a shift/reduce conflict:

基本上,语法接受一系列可选的全局变量声明,后跟函数.

Basically, the grammar accept a list of optional global variables declarations followed by the functions.

我有以下规则:

program: global_list function_list;

type_name : TKINT /* int */
          | TKFLOAT /* float */
          | TKCHAR /* char */

global_list : global_list var_decl ';'
            |
            ;

var_decl : type_name NAME;

function_list : function_list function_def
              |
              ;

function_def : type_name NAME '(' param_list ')' '{' func_body '}' ;

我知道我有一个问题,因为语法无法确定下一个 type_name NAME 是属于global_list还是function_list,并且默认情况下它期望有一个global_list

I understand that I have a problem because the grammar can't decide if the next type_name NAME belongs to global_list or function_list, and by default it is expecting a global_list

例如:

int var1;

int foo(){}

error: unexpcted '(', expecting ';'

推荐答案

问题是, function_def 只能在 function_list 之后出现,这意味着解析器需要减少空的 function_list (使用生产 function_list→ε ),以便可以识别 function_def .此外,它只需通过查看空生产之后的令牌来做出该决定.由于该令牌( type_name )可以启动 var_decl function_def ,因此解析器无法决定.

The problem is that a function_def can only occur after a function_list, which means that the parser needs to reduce an empty function_list (using the production function_list → ε) before it can recognize a function_def. Furthermore, it needs to make that decision by only looking at the token which follows the empty production. Since that token (a type_name) could start either a var_decl or a function_def, there is no way for the parser to decide.

就算再做出一个代币的决定也无济于事;直到第三个标记,才可以做出正确的决定.因此,您的语法不是模棱两可的,而是LR(3).

Even leaving the decision for one more token won't help; it's not until the third token that the correct decision can be made. So your grammar is not ambiguous, but it is LR(3).

可能为空的不同类型列表的序列始终会产生此问题.相比之下,非空列表的序列则不是,因此解决该问题的第一种方法是消除&epsilon-产生.

Sequences of possibly empty lists of different type always create this problem. By contrast, sequences of non-empty lists do not, so a first approach to solving the problem is to eliminate the ε-productions.

首先,我们扩展顶级定义以使两个列表都是可选的:

First, we expand the top-level definition to make it clear that both lists are optional:

program: global_list function_list;
       | global_list
       | function_list
       |
       ;

然后,我们将两种列表类型都设为非空:

Then we make both list types non-empty:

global_list
       : var_decl
       | global_list var_decl
       ;

function_list
       : function_def
       | function_list function_def
       ;

其余语法不变.

type_name : TKINT /* int */
          | TKFLOAT /* float */
          | TKCHAR /* char */

var_decl : type_name NAME;

function_def : type_name NAME '(' param_list ')' '{' func_body '}' ;

值得注意的是,如果可以穿插声明,则永远不会出现此问题.是否真的有必要在任何函数之前定义所有全局变量?如果没有,则可以只使用一种列表类型,这也将是无冲突的:

It's worth noting that the problem would never have arisen if declarations could be interspersed. Is it really necessary that all global variables be defined before any function? If not, you could just use a single list type, which would also be conflict free:

program: decl_list ;

decl_list:
         | decl_list var_decl;
         | decl_list function_def
         ;

这两种解决方案都有效,因为自下而上的解析器可以等到缩减生产结束后才能确定哪个是正确的缩减;直到第三个标记为止, var_decl function_def 看起来都没有关系.

Both these solutions work because a bottom-up parser can wait until the end of the production being reduced in order to decide which is the correct reduction; it does not matter that var_decl and function_def look identical until the third token.

问题实际上是很难弄清一无所有的类型.

The problem really is that it's hard to figure out the type of nothing.

这篇关于C变体语法上的Shift/Reduce冲突的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆