帮助左分解语法以消除左递归 [英] Help with left factoring a grammar to remove left recursion

查看:135
本文介绍了帮助左分解语法以消除左递归的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一种小型的自定义脚本语言,我正在尝试对其进行更新,以允许使用布尔表达式,例如a > 2a > 2 and (b < 3 or c > 5).这是我在这里遇到麻烦的括号表达式.

I have a small custom scripting language, and I am trying to update it to allow boolean expressions such as a > 2 and a > 2 and (b < 3 or c > 5). It's the parenthetical expressions that I am having trouble with here.

这里是一个完整的语法(根据@Bart Kiers的回答根据原始帖子进行编辑)来显示问题.这是我实际语法的精简版,但是问题也在这里发生.

Here is a (edited since the original post based on the answer from @Bart Kiers) full grammar that exhibits the problem. This is a pared-down version of my actual grammar, but the problem occurs here too.

grammar test;


options {
    language = 'JavaScript'; 
    output = AST;
} 


statement 
    :   value_assignment_statement  
        EOF
    ;


value_assignment_statement 
    :   IDENT
        '='
        expression                      
    ;

value_expression 
    :   value_list_expression           
    |   IDENT                           
    ;


value_list_expression 
    :   value_enumerated_list       
    ;


value_enumerated_list : '{' unary+ '}'
    ;



term 
    :   LPAREN expression RPAREN        
    |   INTEGER                         
    |   value_expression                
    ;

unary : ( '+' | '-' )* term
    ;

mult :  unary ( ('*' | '/') unary)*
    ;

expression : mult ( ('+' | '-') mult )*
    ;


boolean 
    :   boolean_expression
        EOF
    ;

boolean_expression
    :   boolean_or_expression
    ;

boolean_or_expression 
    :   boolean_and_expression (OR boolean_and_expression)*
    ;

boolean_and_expression 
    :   boolean_rel_expression (AND boolean_rel_expression)*
    ;

boolean_rel_expression
    :   boolean_neg_expression relational_operator boolean_neg_expression
    ;

boolean_neg_expression 
    :   (NOT)? atom
    ;

atom
    :   LPAREN boolean_expression RPAREN
    //| expression
    ;


relational_operator : '=' | '>' | '<';


LPAREN      :   '(';
RPAREN      :   ')';
AND         :   'and';
OR          :   'or';
NOT         :   'not';
IDENT       :   LETTER LETTER+;
INTEGER     :   DIGIT+;
WS          :   (' ' | '\n' | '\r' | '\t')+     { $channel = HIDDEN; };

fragment DIGIT      : '0'..'9';
fragment LETTER     : ('a'..'z' | 'A'..'Z');

我尝试容纳带括号的布尔表达式(例如a > 2 or (b < 3))在atom规则的注释行中.当我取消注释此行并将其包括在语法中时,ANTLR会给我这个错误:

My attempt to accommodate parenthetical boolean expressions such as a > 2 or (b < 3) is in the commented-out line in the atom rule. When I uncomment this line and include it in the grammar, ANTLR gives me this error:

[致命]规则原子具有非LL(*)决策,这是由于从alt 1,2可以到达的递归规则调用.通过左分解或使用语法谓词或使用backtrack = true选项来解决.

[fatal] rule atom has non-LL(*) decision due to recursive rule invocations reachable from alts 1,2. Resolve by left-factoring or using syntactic predicates or using backtrack=true option.

我想通过删除递归来解决此问题,但是我似乎无法从维基百科有关如何删除我自己的东西的左递归的描述.

I would like to address this by removing the recursion, but I can't seem to make the transition from the Wikipedia description on how to remove left recursion to my own stuff.

在使用此语法时,有时我想将statement用作带有abc = 2 + 3之类的输入的根,该输入将值赋给名为abc的变量.其他时候,我想使用语法以boolean作为输入(例如abc > 3 and (xyz < 5 or xyz > 10))的根来求值表达式.当我尝试使用@Bart的答案作为模型时,它工作得很好,直到我尝试将statement使用的语法部分与boolean使用的部分合并.他们都应该都可以使用expression,但是这就是我遇到这个左递归错误的地方.

In using this grammar, I want sometimes to use statement as a root with input such as abc = 2 + 3, which assigns a value to a variable named abc. Other times I want to use the grammar to evaluate an expression with boolean as the root with input such as abc > 3 and (xyz < 5 or xyz > 10). When I tried to use @Bart's answer as a model, it worked fine until I tried to merge the parts of the grammar used by statement with the parts used by boolean. They should both be able to use an expression, but that's where I'm stuck with this left recursion error.

那么,如何既可以处理括号又可以避免左递归问题?

So, how can I both handle parentheses and avoid the left recursion problem?

推荐答案

布尔表达式与加法和乘法表达式相同,因此不应与它们分开.这是说明所有类型的表达式的方法:

Boolean expressions are just the same as the additive- and multiplicative expression, and should therefore not be separated from them. Here's how to account for all types of expressions:

grammar test;

parse
  :  expression EOF
  ;

expression 
  :  or
  ;

or
  :  and (OR and)*
  ;

and
  :  rel (AND rel)*
  ;

rel
  :  add (('=' | '>' | '<') add)*
  ;

add
  :  mult (('+' | '-') mult)*
  ;

mult
  :  unary (('*' | '/') unary)*
  ;

unary 
  :  '-' term
  |  '+' term
  |  NOT term
  |  term
  ;

term 
  :  INTEGER  
  |  IDENT       
  |  list
  |  '(' expression ')'
  ;

list 
  :  '{' (expression (',' expression)*)? '}'
  ;

AND     :  'and';
OR      :  'or';
NOT     :  'not';
IDENT   :  LETTER LETTER*;
INTEGER :  DIGIT+;
WS      :  (' ' | '\n' | '\r' | '\t')+  { $channel = HIDDEN; };

fragment DIGIT   : '0'..'9';
fragment LETTER  : ('a'..'z' | 'A'..'Z');

它将解析示例输入:

abc > 3 and (xyz < 5 or xyz > {1, 2, 3})

进入以下解析树:

这篇关于帮助左分解语法以消除左递归的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆