使用 yacc 解析表达式序列 [英] Parsing a sequence of expressions using yacc

查看:49
本文介绍了使用 yacc 解析表达式序列的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试解析没有分隔符的表达式序列,以便能够解析 ML/F# 样式的函数调用:

I am trying to parse a sequence of expressions without delimiters in order to be able to parse ML/F# style function invocations:

myfunc expr1 expr2 expr3

然而,表达式序列给了我一个移位/减少冲突的列表.

However, the sequence of expressions is giving me a list of shift/reduce conflicts.

我的猜测是冲突是由我的语法的递归性质引起的,但我不知道如何解决这些冲突.

My guess is that the conflicts are caused by the recursive nature of my grammar, but I don't know how to fix these conflicts.

我的(简化的)优先规则和语法如下所示:

My (simplified) precedence rules and grammar looks like this:

/* Lowest precedence */
%left PLUS
%left TIMES
%left LPAR
/* Highest precedence */

Expr: 
  | CSTINT                                { CstI $1                   }
  | LPAR Expr RPAR                        { $2                        }
  | Expr TIMES Expr                       { Prim("*", $1, $3)         }
  | Expr PLUS Expr                        { Prim("+", $1, $3)         }
  | NAME ExprList                         { Call(Var $1, $2)          }

ExprList:
  |                                      { []                        }
  | Expr ExprList                        { $1::$2                    }

当我将它传递给 fsyacc 时,我会得到一个 shift/reduce 和 reduce/reduce 冲突的列表.一个示例移位/减少冲突是

When I pass this to fsyacc I get a list of shift/reduce and reduce/reduce conflicts. An example shift/reduce conflict is

state 11: shift/reduce error on PLUS

状态 11 的 fsyacc 输出为:

The output from fsyacc for state 11 is:

state 11:
  items:
    Expr -> Expr . 'TIMES' Expr
    Expr -> Expr . 'PLUS' Expr
    ExprList -> Expr . ExprList

  actions:
    action 'EOF' (noprec):   reduce ExprList --> 
    action 'LPAR' (explicit left 10000):   shift 6
    action 'RPAR' (noprec):   reduce ExprList --> 
    action 'COMMA' (noprec):   reduce ExprList --> 
    action 'PLUS' (explicit left 9998):   shift 13
    action 'TIMES' (explicit left 9999):   shift 12
    action 'NAME' (noprec):   shift 14
    action 'CSTINT' (noprec):   shift 5
    action 'error' (noprec):   reduce ExprList --> 
    action '#' (noprec):   reduce ExprList --> 
    action '$$' (noprec):   reduce ExprList --> 

  immediate action: <none>
 gotos:
    goto Expr: 11
    goto ExprList: 16

我已经有一段时间没有参加编译器理论课程了,所以虽然我知道什么是 shift/reduce 和 reduce/reduce 冲突,但我不习惯考虑它们.特别是,我没有看到减少 PLUS 会如何导致有效的解析.总而言之,如果您对以下一个或多个问题有任何见解,我们将不胜感激:

It has been a while since I took a course in compiler theory, so while I kind of know what shift/reduce and reduce/reduce conflicts are, I am not used to thinking about them. In particular, I don't see how reducing on PLUS could lead to a valid parse. All in all, any insight into one or more of the following questions would be highly appreciated:

  1. 为什么我的语法似乎有歧义?
  2. 我可以使用优先级和/或关联性规则修复它吗,或者,如果没有,
  3. 我是否需要重写语法,如果需要,大致如何做?
  4. yacc 是否适合用于此类构造的工具?

推荐答案

1.为什么我的语法看起来有歧义?

你的语法有歧义.这不是错觉.

Your grammar is ambiguous. It's not an illusion.

假设 f 是一个函数.

Suppose f is a function.

f  x + 7

f(x) + 7 还是 f(x+7)?你的语法产生两者.

Is that f(x) + 7 or f(x+7)?. Your grammar produces both.

IIRC,函数应用绑定非常紧密,关联到左边.所以上面的表达式应该解析为 f(x) + 7.

IIRC, function application binds very tightly and associates to the left. So the above expression should parse as f(x) + 7.

2.我可以使用优先级和/或关联性规则来修复它吗?如果不是,

您可以使用优先级和关联性规则消除函数应用程序的歧义;你只需要用 %prec 为它声明一个优先级.然而,它最终看起来有点难看......

You can disambiguate function application with precedence and associativity rules; you just need to declare a precedence for it with %prec. However, it ends up looking a bit ugly and ...

3.是否需要重写语法,如果需要,大致怎么做?

...我认为将函数应用程序表示为 Name ExprList 是不正确的.如果您一次对一个参数进行柯里化会更简洁,至少在构建 AST 时是这样,如果您在语法中而不是使用优先级规则来执行它看起来更漂亮,因为优先级规则实际上不是为隐形运算符设计的.见下文.

... I don't believe it is correct to represent function application as Name ExprList. It's a lot cleaner if you curry the arguments one at a time, at least while building the AST, and it looks prettier if you do it in the grammar rather than with precedence rules, which really weren't designed for invisible operators. See below.

4.yacc 是否适合用于此类构造的工具?

当然,为什么不呢?

这里有两个有效的(据我所知)yacc 语法.第一个对所有内容使用优先声明;第二个分离出函数应用程序,我认为它更清晰:

Here are two working (as far as I know) yacc grammars. The first uses precedence declarations for everything; the second separates out function application, which I think is cleaner:

// grammar1.y:

%left '+'
%left '*'
%left ATOM ';' '(' ')'

%%

program: /* empty */           { $$ = ""; }
       | program statement ';' { std::cout << $2 << std::endl; }
       | program ';'
       ;

statement: expr
         ;

expr: ATOM
    | '(' expr ')'             { $$ = $2; }
    | expr expr %prec ATOM     { $$ = '(' + $1 + ' ' + $2 + ')'; }
    | expr '+' expr            { $$ = "(+ " + $1 + ' ' + $3 + ')'; }
    | expr '*' expr            { $$ = "(* " + $1 + ' ' + $3 + ')'; }
    ;

<小时>

// grammar2.y

%token ATOM

%left '+'
%left '*'

%%

program: /* empty */           { $$ = ""; }
       | program statement ';' { std::cout << $2 << std::endl; }
       | program ';'
       ;

statement: expr
         ;

term : ATOM
     | '(' expr ')'             { $$ = $2; }
     ;

apply: term
     | apply term              { $$ = '(' + $1 + ' ' + $2 + ')'; }
     ;

expr : apply
     | expr '+' expr            { $$ = "(+ " + $1 + ' ' + $3 + ')'; }
     | expr '*' expr            { $$ = "(* " + $1 + ' ' + $3 + ')'; }
     ;

这篇关于使用 yacc 解析表达式序列的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆