How to parse a parenthesized hierarchy root?
Question
I'm trying to parse values with ANTLR. Here's the relevant part of my grammar:
root : IDENTIFIER | SELF | literal | constructor | call | indexer;
hierarchy : root (SUB^ (IDENTIFIER | call | indexer))*;
factor : hierarchy ((MULT^ | DIV^ | MODULO^) hierarchy)*;
sum : factor ((PLUS^ | MINUS^) factor)*;
comparison : sum (comparison_operator^ sum)*;
value : comparison | '(' value ')';
I won't describe each token or rule, since their names explain their roles well enough. This grammar compiles and works well, allowing me to parse, using value, things such as:
a.b[c(5).d[3] * e()] < e("f")
The only thing left for value recognition is to be able to have parenthesized hierarchy roots. For instance:
(a.b).c
(3 < d()).e
...
Naively, and without much expectation, I tried adding the following alternative to my root rule:
root : ... | '(' value ')';
This, however, breaks the value rule due to non-LL(*)ism:
rule value has non-LL(*) decision due to recursive rule invocations reachable
from alts 1,2. Resolve by left-factoring or using syntactic predicates or using
backtrack=true option.
Even after reading most of The Definitive ANTLR Reference, I still don't understand these errors. However, what I do understand is that, upon seeing a parenthesis opening, ANTLR cannot know if it's looking at the beginning of a parenthesized value, or at the beginning of a parenthesized root.
How can I clearly define the behavior of a parenthesized hierarchy root?
Edit: As requested, the additional rules:
parameter : type IDENTIFIER -> ^(PARAMETER ^(type IDENTIFIER));
constructor : NEW type PAREN_OPEN (arguments+=value (SEPARATOR arguments+=value)*)? PAREN_CLOSE -> ^(CONSTRUCTOR type ^(ARGUMENTS $arguments*)?);
call : IDENTIFIER PAREN_OPEN (values+=value (SEPARATOR values+=value)*)? PAREN_CLOSE -> ^(CALL IDENTIFIER ^(ARGUMENTS $values*)?);
indexer : IDENTIFIER INDEX_START (values+=value (SEPARATOR values+=value)*)? INDEX_END -> ^(INDEXER IDENTIFIER ^(ARGUMENTS $values*));
Recommended answer
Remove '(' value ')' from value and place it in root:
root : IDENTIFIER | SELF | literal | constructor | call | indexer | '(' value ')';
...
value : comparison;
Now (a.b).c will result in the following parse:
[figure: parse tree for (a.b).c]
and (3 < d()).e in:
[figure: parse tree for (3 < d()).e]
Of course, you'll probably want to omit the parentheses from the AST:
root : IDENTIFIER | SELF | literal | constructor | call | indexer | '('! value ')'!;
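To make the effect of the moved alternative concrete, here is a minimal hand-written recursive-descent sketch in Python. It is not ANTLR output, and it covers only a reduced, assumed subset of the grammar: identifiers, integer literals, '.', '<', and parentheses (no calls, indexers, or arithmetic). The names tokenize, Parser, and parse are hypothetical, and the nested-tuple tree shape is just an illustration of an operator-rooted AST with the parentheses dropped.

```python
import re

# One token per match: identifier, integer, or one of ( ) . <
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|\d+|[().<])")

def tokenize(text):
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

class Parser:
    def __init__(self, text):
        self.toks = tokenize(text)
        self.i = 0

    def peek(self):
        return self.toks[self.i] if self.i < len(self.toks) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        self.i += 1
        return tok

    def value(self):
        # value : hierarchy ('<'^ hierarchy)* ;   (reduced comparison rule)
        node = self.hierarchy()
        while self.peek() == "<":
            self.take("<")
            node = ("<", node, self.hierarchy())
        return node

    def hierarchy(self):
        # hierarchy : root ('.'^ IDENTIFIER)* ;
        node = self.root()
        while self.peek() == ".":
            self.take(".")
            name = self.take()
            if not (name[0].isalpha() or name[0] == "_"):
                raise SyntaxError(f"expected identifier, got {name!r}")
            node = (".", node, name)
        return node

    def root(self):
        # root : IDENTIFIER | literal | '('! value ')'! ;
        tok = self.take()
        if tok == "(":
            node = self.value()
            self.take(")")
            return node  # parentheses are not kept in the tree
        if tok in {")", ".", "<"}:
            raise SyntaxError(f"unexpected {tok!r}")
        return tok

def parse(text):
    parser = Parser(text)
    tree = parser.value()
    if parser.peek() is not None:
        raise SyntaxError(f"trailing input: {parser.peek()!r}")
    return tree
```

With this sketch, parse("(a.b).c") returns ('.', ('.', 'a', 'b'), 'c') and parse("(3 < d).e") returns ('.', ('<', '3', 'd'), 'e'). Because the parenthesized form is reachable only through root, the parser never has to choose between two rules when it sees an opening parenthesis.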
Also, you don't need to append tokens to a List using += in your parser rules. The following:
call
: IDENTIFIER PAREN_OPEN (values+=value (SEPARATOR values+=value)*)? PAREN_CLOSE
-> ^(CALL IDENTIFIER ^(ARGUMENTS $values*)?)
;
can be rewritten as:
call
: IDENTIFIER PAREN_OPEN (value (SEPARATOR value)*)? PAREN_CLOSE
-> ^(CALL IDENTIFIER ^(ARGUMENTS value*)?)
;
EDIT
Your main problem is that certain input can be parsed in two (or more!) ways. For example, the input (a) could be parsed by both alternative 1 and alternative 2 of your value rule:
value
: comparison // alternative 1
| '(' value ')' // alternative 2
;
Run through your parser rules: a comparison (alternative 1) can match (a) because it matches the root rule, which in turn matches '(' value ')'. But that is also what alternative 2 matches! And there you have it: for one input the parser "sees" two different parses, and it reports this ambiguity.
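The ambiguity can be checked by brute force. The Python sketch below counts the distinct derivations of an input under a reduced, assumed version of the grammar (identifiers, '.', and parentheses only). The two flags switch the '(' value ')' alternative on or off in root and in value. It does not emulate ANTLR's LL(*) analysis; it only enumerates parses, and all function names are hypothetical.

```python
import re

def tokenize(text):
    # Identifiers plus ( ) and . ; any other character is silently ignored.
    return re.findall(r"[A-Za-z_]\w*|[().]", text)

def is_ident(tok):
    return tok[0].isalpha() or tok[0] == "_"

def root_ends(toks, i, paren_in_root, paren_in_value):
    """Yield each index where `root` can end, once per derivation."""
    if i < len(toks) and is_ident(toks[i]):
        yield i + 1
    if paren_in_root and i < len(toks) and toks[i] == "(":
        for j in value_ends(toks, i + 1, paren_in_root, paren_in_value):
            if j < len(toks) and toks[j] == ")":
                yield j + 1

def hierarchy_ends(toks, i, paren_in_root, paren_in_value):
    # hierarchy : root ('.' IDENTIFIER)* ;
    for j in root_ends(toks, i, paren_in_root, paren_in_value):
        yield j
        while j + 1 < len(toks) and toks[j] == "." and is_ident(toks[j + 1]):
            j += 2
            yield j

def value_ends(toks, i, paren_in_root, paren_in_value):
    # alternative 1: comparison (reduced here to a single hierarchy)
    yield from hierarchy_ends(toks, i, paren_in_root, paren_in_value)
    # alternative 2: '(' value ')'
    if paren_in_value and i < len(toks) and toks[i] == "(":
        for j in value_ends(toks, i + 1, paren_in_root, paren_in_value):
            if j < len(toks) and toks[j] == ")":
                yield j + 1

def count_parses(text, paren_in_root, paren_in_value):
    """Number of derivations of `value` that consume the whole input."""
    toks = tokenize(text)
    ends = value_ends(toks, 0, paren_in_root, paren_in_value)
    return sum(1 for j in ends if j == len(toks))
```

With the alternative present in both rules, count_parses("(a)", True, True) gives 2, and each extra level of nesting doubles the count. With the alternative only in root, count_parses("(a)", True, False) gives 1. With it only in value, as in the original grammar, count_parses("(a.b).c", False, True) gives 0, which is why the question came up in the first place.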