ANTLR4解析时互左递归错误 [英] ANTLR4 mutually left-recursive error when parsing
问题描述
我有这个 ANTLR 4 语法:
I have this ANTLR 4 grammar:
constantFixedExpresion : term (('+'|'-') term)+;
term : factor (('*'|'//'|'REM')factor)+;
factor : ('+'|'-')*
( wholeNumberConstant
| constantFixedExpresion
| 'TOFIXED' (stringConstant | bitCodeConstant)
| identifier)
('FIT'constantFixedExpresion)*;
我收到以下错误:
error(119): LanguageA.g4::: 以下规则集是相互左递归的 [constantFixedExpresion, factor, term]
error(119): LanguageA.g4::: The following sets of rules are mutually left-recursive [constantFixedExpresion, factor, term]
我尝试了很多方法,但无法解决.是什么问题,我该如何解决?
I tried so many ways but can't fix it. What is the problem and how can I solve it?
推荐答案
Antlr 是一个 LL(*) 解析器,它在很多方面都比 LL(k) 解析器,但仍有许多缺点.其中之一是它无法处理左递归(实际上,版本 4 可以在同一规则内处理左递归).错误是说你有一个语法的左递归,这是 LL 解析器的祸根.
Antlr is a LL(*) parser, which is in many ways "better" than a LL(k) parser, but still has many of its disavantages. One of these being the fact it can't deal with left-recursion (in fact, the version 4 can deal with left-recursion within the same rule). What the error is saying is that you have a left-recursion of a grammar, a bane for the LL parsers.
这是由语法中的这种结构引起的:
This is caused by this construction in your grammar:
constantFixedExpression: term ...;
term: factor ...;
factor: ('+' | '-')* (constantFixedExpression | ...) ...;
由于*
操作符表示0或更多,我可以用0实例化它,所以解析器会这样做:尝试constantFixedExpression
,所以它需要尝试term
,所以它需要尝试 factor
,所以它需要尝试 constantFixedEXpression
,所以它 [...]无限循环.
Since the *
operator means 0 or more, I can instantiate it with 0, so the parser will do this: "try constantFixedExpression
, so it needs to try term
, so it needs to try factor
, so it needs to try constantFixedEXpression
, so it [...]" and you've got yourself an infinite loop.
幸运的是,上下文无关的形式语法有一个等效的转换来消除左递归!它可以一般地表示为:
Fortunately, context-free formal grammars have an equivalent transformation for removing left-recursion! It can be expressed generically by:
A -> Aa | b
-- becomes --
A -> bR
R -> aR | ε
或者用 Antlr 表示法:
Or in Antlr notation:
A: Aa | b;
// becomes
A: bR;
R: (aR)?;
有关此过程的更多信息可以在自动机/语法书籍或维基百科中找到.
More information about this process can be found in automaton/grammars books or in the Wikipedia.
我将通过重构来纠正您的语法,以消除左递归作为您的工作.不过,我想再谈一点:Antlr 4 可以做左递归!正如我提到的,版本 4 可以在相同的规则内处理左递归.在 Antlr4 中,除了直接在解析中指示运算符的优先级和关联性之外,还有其他方法可以指示运算符的优先级和关联性.让我们看看它是如何工作的:
I'll leave correcting your grammar with the refactoration to remove left-recursion as your work. However, I want to touch in another point: Antlr 4 can do left-recursion! As I mentioned, the version 4 can deal with left-recursion within the same rule. There are ways to indicate precedence and associativity of operators other than directly in parsing, as you're doing, in Antlr4. Let's see how it works:
expr: NUMBER
|<assoc=right> expr '^' expr
| expr '*' expr
| expr '/' expr
| expr '+' expr
| expr '-' expr;
这是一个基本的计算器语法示例.顶部的运算符优先级最高,底部的运算符优先级较低.这意味着 2+2*3
将被解析为 2+(2*3)
而不是 (2+2)*3
.<assoc=right>
结构意味着运算符处于右结合,所以 1^2^3
将被解析为 1^(2^3)
而不是 (1^2)^3
.
This is an example of a basic calculator grammar. The operators at the top are those with highest precedence, and those at the bottom are of lower precedence. This means 2+2*3
will be parsed as 2+(2*3)
rather than (2+2)*3
. The <assoc=right>
construction means the operator in right-associative, so 1^2^3
will be parsed as 1^(2^3)
rather than (1^2)^3
.
如您所见,使用左递归指定运算符要容易得多,因此 Antlr 4 在这些时刻有很大帮助!我建议您重新编写语法以利用此功能.
As you can see, it is much easier to specify operators with left-recursion, so Antlr 4 is of big help in these moments! I recommend re-writing your grammar to make use of this feature.
这篇关于ANTLR4解析时互左递归错误的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!