Removing Left Recursion in ANTLR


Question

As explained in "Removing left recursion", there are two ways to remove left recursion:

  • Modify the original grammar, using some procedure, to remove the left recursion
  • Write the grammar without left recursion in the first place

What do people normally use to remove (or avoid) left recursion with ANTLR? I've used flex/bison for parsers, but I need to use ANTLR. The only thing that concerns me about using ANTLR (or LL parsers in general) is left recursion removal.

  • In a practical sense, how serious a problem is removing left recursion in ANTLR? Is it a showstopper for using ANTLR, or does nobody in the ANTLR community care about it?
  • I like the idea of ANTLR's AST generation. For getting an AST quickly and easily, which of the two left recursion removal methods is preferable?

Added

I did some experiments with the following grammar.

E -> E + T|T
T -> T * F|F
F -> INT | ( E )

After removing the left recursion, I get the following grammar (F is unchanged):

E -> TE'
E' -> null | + TE'
T -> FT'
T' -> null | * FT'
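
The rewrite above follows the standard rule for immediate left recursion: A -> A α | β becomes A -> β A' and A' -> α A' | null. As a rough illustration only, here is a minimal Python sketch of that rule. The function name and the list-of-lists grammar representation are assumptions made for this example, not part of ANTLR, and the sketch handles only immediate (not indirect) left recursion.

```python
def eliminate_immediate_left_recursion(nonterminal, alternatives):
    """Rewrite A -> A a | b as A -> b A', A' -> a A' | null.

    `alternatives` is a list of productions, each a list of symbols.
    Returns a dict mapping rule names to their new alternatives;
    an empty list stands for the null (empty) production.
    """
    # Alternatives that start with the rule itself, minus that first symbol.
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # Alternatives that do not start with the rule itself.
    others = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not recursive:
        return {nonterminal: alternatives}
    tail = nonterminal + "'"
    return {
        nonterminal: [alt + [tail] for alt in others],
        tail: [[]] + [alt + [tail] for alt in recursive],
    }


grammar = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["INT"], ["(", "E", ")"]],
}

rewritten = {}
for name, alts in grammar.items():
    rewritten.update(eliminate_immediate_left_recursion(name, alts))

for name, alts in rewritten.items():
    print(name, "->", " | ".join(" ".join(alt) or "null" for alt in alts))
```

Run on the E/T/F grammar, this prints the same four rewritten rules as above, plus the untouched F rule.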

I could come up with the following ANTLR representation. Even though it's relatively simple and straightforward, it seems that writing the grammar without left recursion in the first place would be the better way to go.

grammar T;

options {
    language=Python;
}

start returns [value]
   : e {$value = $e.value};

// e : t ep   (from E -> T E')
e returns [value]
   : t ep
     {
       $value = $t.value
       if $ep.value is not None:
         $value += $ep.value
     }
   ;

// ep is E': either empty (value None) or '+' t ep
ep returns [value]
   : {$value = None}
   | '+' t r = ep
     {
       $value = $t.value
       if $r.value is not None:
         $value += $r.value
     }
   ;

// t : f tp   (from T -> F T')
t returns [value]
  : f tp
    {
      $value = $f.value
      if $tp.value is not None:
        $value *= $tp.value
    }
  ;

// tp is T': either empty (value None) or '*' f tp
tp returns [value]
  : {$value = None}
  | '*' f r = tp
    {
      $value = $f.value
      if $r.value is not None:
        $value *= $r.value
    }
  ;

f returns [int value]
  : INT {$value = int($INT.text)}
  | '(' e ')' {$value = $e.value}
  ;

INT :   '0'..'9'+ ;
WS: (' '|'\n'|'\r')+ {$channel=HIDDEN;} ;
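
To see how the rules above behave without running ANTLR, here is a hand-written recursive-descent sketch in plain Python that mirrors e, ep, t, tp, and f one method per rule. The `tokenize`, `Parser`, and `evaluate` names are made up for this illustration; this is not the code ANTLR would generate.

```python
import re


def tokenize(text):
    # INT tokens and the single-character operators; whitespace is skipped,
    # much like the hidden WS channel. Other characters are silently ignored
    # in this sketch.
    return re.findall(r"\d+|[+*()]", text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def match(self, expected):
        if self.peek() != expected:
            raise SyntaxError(f"expected {expected!r}, got {self.peek()!r}")
        self.pos += 1

    def e(self):   # e : t ep
        value = self.t()
        rest = self.ep()
        return value if rest is None else value + rest

    def ep(self):  # ep : <empty> | '+' t ep
        if self.peek() != "+":
            return None
        self.match("+")
        value = self.t()
        rest = self.ep()
        return value if rest is None else value + rest

    def t(self):   # t : f tp
        value = self.f()
        rest = self.tp()
        return value if rest is None else value * rest

    def tp(self):  # tp : <empty> | '*' f tp
        if self.peek() != "*":
            return None
        self.match("*")
        value = self.f()
        rest = self.tp()
        return value if rest is None else value * rest

    def f(self):   # f : INT | '(' e ')'
        token = self.peek()
        if token == "(":
            self.match("(")
            value = self.e()
            self.match(")")
            return value
        if token is None or not token.isdigit():
            raise SyntaxError(f"expected INT or '(', got {token!r}")
        self.pos += 1
        return int(token)


def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.e()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value


print(evaluate("1 + 2 * 3"))    # 7
print(evaluate("(1 + 2) * 3"))  # 9
```

One thing to keep in mind: the right-recursive rules group operands to the right. That is harmless for + and *, but the same scheme applied to a left-associative operator such as subtraction would compute 5 - (3 - 1) instead of (5 - 3) - 1.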

Answer

Consider something like a typical parameter list:

parameter_list: parameter
              | parameter_list ',' parameter
              ;

Since you don't care about anything like precedence or associativity with parameters, this is fairly easy to convert to right recursion, at the expense of adding an extra production:

parameter_list: parameter more_params
              ;

more_params:
           | ',' parameter more_params
           ;

For the most serious cases, you might want to spend some time with the Dragon Book (Compilers: Principles, Techniques, and Tools). A quick check shows this is covered primarily in chapter 4.

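As far as seriousness goes, I'm pretty sure ANTLR (at least version 3, whose syntax the grammar in the question uses) simply won't accept a grammar that contains left recursion, which would put it into the "absolute necessity" category. ANTLR 4 does accept directly left-recursive rules, though not indirect left recursion.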
