Removing Left Recursion in ANTLR


Problem description

As explained in "Removing left recursion", there are two ways to get rid of left recursion:

  • Take the original grammar and apply a mechanical procedure that removes the left recursion.
  • Write the grammar without left recursion in the first place.

Which of these do people normally use with ANTLR? I have used flex/bison for parsers, but I now need to use ANTLR. My only concern about using ANTLR (or LL parsers in general) is left recursion removal.

  • In practical terms, how serious a problem is removing left recursion in ANTLR? Is it a showstopper for using ANTLR, or does nobody in the ANTLR community care about it?
  • I like ANTLR's AST generation. For getting an AST quickly and easily, which of the two methods above is preferable?

Added

I experimented with the following grammar.

E -> E + T|T
T -> T * F|F
F -> INT | ( E )

After removing the left recursion, I get the following (F is unchanged):

E -> TE'
E' -> null | + TE'
T -> FT'
T' -> null | * FT'
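
The rewrite above follows the usual textbook pattern: A -> A x | y becomes A -> y A' and A' -> null | x A'. A minimal Python sketch of that single step, assuming the grammar is stored as a dict from rule name to a list of alternatives (the function name `remove_direct_left_recursion` and the data layout are illustrative, and only direct left recursion is handled):

```python
def remove_direct_left_recursion(name, alternatives):
    """Rewrite A -> A x | y  as  A -> y A'  and  A' -> null | x A'.

    `alternatives` is a list of alternatives, each a list of symbol strings.
    Returns a dict of rule name -> alternatives; [] is the null alternative.
    """
    # Alternatives that start with the rule itself, minus that leading symbol.
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == name]
    # All remaining alternatives.
    others = [alt for alt in alternatives if not alt or alt[0] != name]
    if not recursive:
        return {name: alternatives}  # nothing to rewrite
    tail = name + "'"
    return {
        name: [alt + [tail] for alt in others],
        tail: [[]] + [alt + [tail] for alt in recursive],
    }


grammar = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["INT"], ["(", "E", ")"]],
}

result = {}
for rule, alts in grammar.items():
    result.update(remove_direct_left_recursion(rule, alts))

for rule, alts in result.items():
    print(rule, "->", " | ".join(" ".join(alt) or "null" for alt in alts))
```

Run on the grammar from the question, this prints the four rewritten rules listed above plus the unchanged F rule. Indirect left recursion (A -> B ..., B -> A ...) would need the fuller algorithm and is not covered by this sketch.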

I came up with the following ANTLR representation. Even though it is relatively simple and straightforward, it seems that writing the grammar without left recursion in the first place is the better way to go.

grammar T;

options {
    language=Python;
}

start returns [value]
   : e {$value = $e.value};
e returns [value]
   : t ep  
     {
       $value = $t.value
       if $ep.value != None:
         $value += $ep.value
     }
   ;
ep returns [value]
   : {$value = None}
   | '+' t r = ep 
     {
       $value = $t.value
       if $r.value != None:
         $value += $r.value
     }
   ;
t returns [value]
  : f tp 
    {
      $value = $f.value
      if $tp.value != None:
        $value *= $tp.value
    }
  ;
tp returns [value]
  : {$value = None}
  | '*' f r = tp 
    {
      $value = $f.value
      if $r.value != None:
        $value *= $r.value
    }
  ;
f returns [int value]
  : INT {$value = int($INT.text)}
  | '(' e ')' {$value = $e.value}
  ;

INT :   '0'..'9'+ ;
WS: (' '|'\n'|'\r')+ {$channel=HIDDEN;} ;
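
For comparison, the same transformed grammar can be written by hand as a recursive-descent evaluator, with one method per rule. This is only a sketch in plain Python, not ANTLR output: the names `tokenize`, `Parser` and `evaluate` are made up for illustration. Unlike the grammar actions above, the primed rules here receive the left operand as an argument, so the result is accumulated left to right.

```python
import re


def tokenize(text):
    """Return INT, '+', '*', '(' and ')' tokens.

    Whitespace is skipped. Any other character is silently ignored,
    which is good enough for a sketch.
    """
    return re.findall(r"\d+|[+*()]", text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, token):
        if self.peek() != token:
            raise SyntaxError("expected %r, got %r" % (token, self.peek()))
        self.pos += 1

    def e(self):            # E -> T E'
        return self.ep(self.t())

    def ep(self, left):     # E' -> null | + T E'
        if self.peek() == "+":
            self.expect("+")
            return self.ep(left + self.t())
        return left

    def t(self):            # T -> F T'
        return self.tp(self.f())

    def tp(self, left):     # T' -> null | * F T'
        if self.peek() == "*":
            self.expect("*")
            return self.tp(left * self.f())
        return left

    def f(self):            # F -> INT | ( E )
        token = self.peek()
        if token == "(":
            self.expect("(")
            value = self.e()
            self.expect(")")
            return value
        if token is None or not token.isdigit():
            raise SyntaxError("expected INT or '(', got %r" % (token,))
        self.pos += 1
        return int(token)


def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.e()
    if parser.peek() is not None:
        raise SyntaxError("unexpected token %r" % (parser.peek(),))
    return value


print(evaluate("2 + 3 * (4 + 1)"))  # prints 17
```

Passing the left operand down makes no difference for + and *, but it would for non-associative operators such as - or /, where evaluating the right-recursive tail first gives the wrong grouping.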

Solution

Consider something like a typical parameter list:

parameter_list: parameter
              | parameter_list ',' parameter
              ;

Since you don't care about anything like precedence or associativity with parameters, this is fairly easy to convert to right recursion, at the expense of adding an extra production:

parameter_list: parameter more_params
              ;

more_params:
           | ',' parameter more_params
           ;

For the more serious cases, you might want to spend some time with the Dragon Book. From a quick check, this is covered primarily in chapter 4.

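As far as seriousness goes, I'm pretty sure ANTLR simply won't accept a grammar that contains left recursion, which puts removing it in the "absolute necessity" category. (That is true of ANTLR 3, which the grammar in the question is written for; ANTLR 4 accepts direct left recursion within a single rule but still rejects indirect left recursion.)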
