解析时ANTLR4相互左递归错误 [英] ANTLR4 mutually left-recursive error when parsing

查看:82
本文介绍了解析时ANTLR4相互左递归错误的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有这个ANTLR 4语法:

I have this ANTLR 4 grammar:

constantFixedExpresion : term (('+'|'-') term)+;

term : factor (('*'|'//'|'REM')factor)+;

factor : ('+'|'-')*
           ( wholeNumberConstant
           | constantFixedExpresion
           | 'TOFIXED' (stringConstant | bitCodeConstant)      
           | identifier)
         ('FIT'constantFixedExpresion)*;

我收到以下错误:

错误(119):语言A.g4 :::以下规则集相互左递归[constantFixedExpresion,因子,项]

error(119): LanguageA.g4::: The following sets of rules are mutually left-recursive [constantFixedExpresion, factor, term]

我尝试了很多方法,但是无法解决.有什么问题,我该如何解决?

I tried so many ways but can't fix it. What is the problem and how can I solve it?

推荐答案

Antlr是LL(*)解析器,在许多方面都比

Antlr is a LL(*) parser, which is in many ways "better" than a LL(k) parser, but still has many of its disavantages. One of these being the fact it can't deal with left-recursion (in fact, the version 4 can deal with left-recursion within the same rule). What the error is saying is that you have a left-recursion of a grammar, a bane for the LL parsers.

这是由您的语法中的这种构造引起的:

This is caused by this construction in your grammar:

constantFixedExpression: term ...;
term: factor ...;
factor: ('+' | '-')* (constantFixedExpression | ...) ...;

由于 * 运算符表示0或更大,因此我可以将其实例化为0,因此解析器将执行以下操作:尝试 constantFixedExpression ,因此它需要尝试 term ,因此它需要尝试 factor ,因此需要尝试 constantFixedEXpression ,这样[...],您便拥有了自己无限循环.

Since the * operator means 0 or more, I can instantiate it with 0, so the parser will do this: "try constantFixedExpression, so it needs to try term, so it needs to try factor, so it needs to try constantFixedEXpression, so it [...]" and you've got yourself an infinite loop.

幸运的是,无上下文形式语法具有等效的转换,可以消除左递归!可以通过以下方式概括地表示:

Fortunately, context-free formal grammars have an equivalent transformation for removing left-recursion! It can be expressed generically by:

A -> Aa | b
-- becomes --
A -> bR
R -> aR | ε

或使用Antlr表示法:

Or in Antlr notation:

A: Aa | b;
// becomes
A: bR;
R: (aR)?;

有关此过程的更多信息,请参见自动机/语法书籍或 Wikipedia .

More information about this process can be found in automaton/grammars books or in the Wikipedia.

我将通过重构来纠正您的语法,以删除左递归作为您的工作.但是,我想谈谈另一点:Antlr 4可以进行左递归!如前所述,版本4可以在同一规则内处理左递归.除了直接在Antlr4中进行解析时,还有其他方法可以指示运算符的优先级和关联性.让我们看看它是如何工作的:

I'll leave correcting your grammar with the refactoration to remove left-recursion as your work. However, I want to touch in another point: Antlr 4 can do left-recursion! As I mentioned, the version 4 can deal with left-recursion within the same rule. There are ways to indicate precedence and associativity of operators other than directly in parsing, as you're doing, in Antlr4. Let's see how it works:

expr: NUMBER
      |<assoc=right> expr '^' expr
      | expr '*' expr
      | expr '/' expr
      | expr '+' expr
      | expr '-' expr;

这是基本计算器语法的示例.顶部的运算符是优先级最高的运算符,而底部的运算符则优先级较低.这意味着 2 + 2 * 3 将被解析为 2+(2 * 3),而不是(2 + 2)* 3 .< assoc = right> 的构造意味着运算符为右关联,因此 1 ^ 2 ^ 3 将被解析为 1 ^(2 ^ 3)而不是(1 ^ 2)^ 3 .

This is an example of a basic calculator grammar. The operators at the top are those with highest precedence, and those at the bottom are of lower precedence. This means 2+2*3 will be parsed as 2+(2*3) rather than (2+2)*3. The <assoc=right> construction means the operator in right-associative, so 1^2^3 will be parsed as 1^(2^3) rather than (1^2)^3.

如您所见,使用左递归指定运算符要容易得多,因此,Antlr 4在这些时候起了很大的帮助!我建议重新编写语法以使用此功能.

As you can see, it is much easier to specify operators with left-recursion, so Antlr 4 is of big help in these moments! I recommend re-writing your grammar to make use of this feature.

这篇关于解析时ANTLR4相互左递归错误的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆