ANTLR lexer rule only seems to work as part of a parser rule, and not as part of another lexer rule
Question
Suppose I have the following grammar to parse a list of integers separated by whitespace:
grammar TEST;
test
: expression* EOF
;
expression
: integerLiteral
;
integerLiteral
: INTLITERAL
;
PLUS: '+';
MINUS: '-';
DIGIT: '0'..'9';
DIGITS: DIGIT+;
INTLITERAL: (PLUS|MINUS)? DIGITS;
WS: [ \t\r\n] -> skip;
It does not work! If I pass "100" I get:
line 1:0 extraneous input '100' expecting {<EOF>, INTLITERAL}
However, if I remove the lexer rule INTLITERAL and put its contents directly into the parser rule integerLiteral, like this:
integerLiteral
: (PLUS|MINUS)? DIGITS
;
now everything seems to work!
I feel that if I can understand why this is, I'll begin to understand some of the idiosyncrasies I am experiencing.
Recommended answer
The lexer creates tokens in the following manner:
- try to match as many characters as possible for a single token
- if two tokens match the same characters, let the one defined first "win"
Given the information from the 2 rules above, you will see that your rules:
DIGITS: DIGIT+;
INTLITERAL: (PLUS|MINUS)? DIGITS;
are the problem. For the input 100, a DIGITS token is created. Rule 2 applies here: both rules match 100, but since DIGITS is defined before INTLITERAL, a DIGITS token is created. The parser rule integerLiteral expects an INTLITERAL token, so the DIGITS token is reported as extraneous input. This is also why your second version works: there the parser rule refers to DIGITS directly.
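To make the two rules concrete, here is a minimal Python sketch of a toy lexer that follows them: longest match wins, and ties go to the rule listed first. It is not how ANTLR is implemented; the function name tokenize and the regular expressions are illustrative stand-ins for the lexer rules of the grammar in the question.

```python
import re

def tokenize(text, rules):
    """Toy lexer: the longest match wins; on a tie the earlier rule wins."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            # Only a strictly longer match replaces the current best,
            # so on equal length the rule defined first is kept.
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        if best_name != "WS":  # mimics '-> skip'
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

# Same order as in the grammar from the question (regexes are stand-ins).
ORIGINAL_ORDER = [
    ("PLUS", r"\+"),
    ("MINUS", r"-"),
    ("DIGIT", r"[0-9]"),
    ("DIGITS", r"[0-9]+"),
    ("INTLITERAL", r"[+-]?[0-9]+"),
    ("WS", r"[ \t\r\n]"),
]

print(tokenize("100", ORIGINAL_ORDER))   # [('DIGITS', '100')]
print(tokenize("-100", ORIGINAL_ORDER))  # [('INTLITERAL', '-100')]
```

In this model, 100 becomes a DIGITS token because DIGITS and INTLITERAL match the same three characters and DIGITS comes first, while -100 becomes an INTLITERAL because only that rule can match all four characters.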
Move INTLITERAL above DIGITS:
INTLITERAL: (PLUS|MINUS)? DIGITS;
DIGIT: '0'..'9';
DIGITS: DIGIT+;
But now notice that DIGIT and DIGITS will never become tokens on their own, because INTLITERAL will always be matched first. In this case, you can make both of these rules fragments, and then it doesn't matter where you place them, because fragment rules are only used inside other lexer rules (not in parser rules).
Make DIGIT and DIGITS fragments:
fragment DIGIT: '0'..'9';
fragment DIGITS: DIGIT+;
INTLITERAL: (PLUS|MINUS)? DIGITS;
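As a rough analogy in Python (a sketch, not ANTLR's implementation), fragments behave like named sub-patterns that are spliced into token rules but never listed as token rules themselves. The names TOKEN_RULES and tokenize below are hypothetical, and the regular expressions stand in for the lexer rules.

```python
import re

# "Fragments": reusable sub-patterns that never produce tokens themselves.
DIGIT = r"[0-9]"
DIGITS = f"(?:{DIGIT})+"

# Only real token rules are listed. DIGIT and DIGITS cannot compete with
# INTLITERAL any more, so where they are defined does not matter.
TOKEN_RULES = [
    ("INTLITERAL", f"[+-]?{DIGITS}"),
    ("PLUS", r"\+"),
    ("MINUS", "-"),
    ("WS", r"[ \t\r\n]"),
]

def tokenize(text):
    tokens, pos = [], 0
    while pos < len(text):
        found = []
        for name, pattern in TOKEN_RULES:
            m = re.compile(pattern).match(text, pos)
            if m:
                found.append((name, m.group()))
        if not found:
            raise ValueError(f"no rule matches at position {pos}")
        # max() returns the first maximal item, so earlier rules win ties.
        name, lexeme = max(found, key=lambda item: len(item[1]))
        if name != "WS":  # mimics '-> skip'
            tokens.append((name, lexeme))
        pos += len(lexeme)
    return tokens

print(tokenize("100 7 -5"))
# [('INTLITERAL', '100'), ('INTLITERAL', '7'), ('INTLITERAL', '-5')]
```

Every number, whether one digit or several, signed or not, now comes out as an INTLITERAL in this model, and no DIGIT or DIGITS token can appear.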
Solution 3
Or better, don't glue the operator onto the INTLITERAL, but match it in a unary expression:
expression
: (MINUS | PLUS) expression
| expression (MINUS | PLUS) expression
| integerLiteral
;
integerLiteral
: INTLITERAL
;
PLUS: '+';
MINUS: '-';
fragment DIGIT: '0'..'9';
INTLITERAL: DIGIT+;
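One reason to prefer this, sketched below with a toy longest-match lexer in Python: when the sign is part of INTLITERAL, an input such as 1+2 is split into two integer tokens and no PLUS token is ever produced, so a binary expression rule cannot match. The rule lists SIGNED and UNSIGNED and the function tokenize are illustrative names, and the regular expressions are stand-ins for the lexer rules, not ANTLR itself.

```python
import re

def tokenize(text, rules):
    """Toy lexer: the longest match wins; on a tie the earlier rule wins."""
    tokens, pos = [], 0
    while pos < len(text):
        candidates = []
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            if m:
                candidates.append((name, m.group()))
        if not candidates:
            raise ValueError(f"no rule matches at position {pos}")
        # max() returns the first maximal item, so earlier rules win ties.
        name, lexeme = max(candidates, key=lambda c: len(c[1]))
        tokens.append((name, lexeme))
        pos += len(lexeme)
    return tokens

# Sign glued onto the literal (as in the fragment-based version).
SIGNED = [("PLUS", r"\+"), ("MINUS", "-"), ("INTLITERAL", r"[+-]?[0-9]+")]
# Sign left to the parser (as in Solution 3).
UNSIGNED = [("PLUS", r"\+"), ("MINUS", "-"), ("INTLITERAL", r"[0-9]+")]

print(tokenize("1+2", SIGNED))
# [('INTLITERAL', '1'), ('INTLITERAL', '+2')]
print(tokenize("1+2", UNSIGNED))
# [('INTLITERAL', '1'), ('PLUS', '+'), ('INTLITERAL', '2')]
```

With the unsigned rule, the operator survives as its own token and the parser decides whether it is unary or binary.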