ANTLR Lexer rule only seems to work as part of parser rule, and not part of another lexer rule

Question

If I have the following grammar to parse a list of integers separated by whitespace:

grammar TEST;

test
    : expression* EOF
    ;

expression
    : integerLiteral
    ;

integerLiteral
    : INTLITERAL
    ;

PLUS: '+';
MINUS: '-';

DIGIT: '0'..'9';
DIGITS: DIGIT+;
INTLITERAL: (PLUS|MINUS)? DIGITS;

WS: [ \t\r\n] -> skip;

It does not work! If I pass "100" I get:

line 1:0 extraneous input '100' expecting {<EOF>, INTLITERAL}

However, if I remove the lexer rule INTLITERAL and put its body directly in the parser rule integerLiteral, like this:

integerLiteral
    : (PLUS|MINUS)? DIGITS
    ;

Now everything seems to work!

I feel that if I can understand why this is, I'll begin to understand some of the idiosyncrasies I am experiencing.

Recommended answer

The lexer creates tokens in the following manner:

  1. try to match as many characters as possible for a single token
  2. if two tokens match the same characters, let the one defined first "win"

Given the information from the 2 rules above, you will see that your rules:

DIGITS: DIGIT+;
INTLITERAL: (PLUS|MINUS)? DIGITS;

are the problem. For the input 100, a DIGITS token is created. Rule 2 applies here: both rules match 100, but since DIGITS is defined before INTLITERAL, DIGITS wins. The parser rule integerLiteral expects an INTLITERAL token, so it rejects the DIGITS token. Your parser-rule version works because it refers to DIGITS directly.
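The effect of the two matching rules can be reproduced outside ANTLR. The following is a minimal Python sketch, not ANTLR's actual implementation: the `tokenize` helper and the regular expressions standing in for the lexer rules are assumptions made for illustration. It picks the longest match at each position and, on a tie, keeps the rule that was listed first.

```python
import re

def tokenize(rules, text):
    """Tokenize text with ordered (name, regex) rules.

    Rule 1: the longest match wins.
    Rule 2: on a tie, the rule listed first wins.
    """
    compiled = [(name, re.compile(pattern)) for name, pattern in rules]
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, regex in compiled:
            m = regex.match(text, pos)
            # Strict '>' keeps the earlier rule when lengths are equal.
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        if best_name != "WS":  # mimic '-> skip'
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

# Regex stand-ins for the lexer rules, in the order from the question.
ORIGINAL_ORDER = [
    ("PLUS", r"\+"),
    ("MINUS", r"-"),
    ("DIGIT", r"[0-9]"),
    ("DIGITS", r"[0-9]+"),
    ("INTLITERAL", r"[+-]?[0-9]+"),
    ("WS", r"[ \t\r\n]"),
]

# Same rules, with INTLITERAL moved ahead of DIGIT and DIGITS.
REORDERED = [ORIGINAL_ORDER[i] for i in (0, 1, 4, 2, 3, 5)]

print(tokenize(ORIGINAL_ORDER, "100"))  # [('DIGITS', '100')]
print(tokenize(REORDERED, "100"))       # [('INTLITERAL', '100')]
```

With the original order, an input such as +7 still becomes an INTLITERAL, because no other rule matches both characters. Only unsigned numbers are captured by DIGITS.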

Move INTLITERAL above DIGITS:

INTLITERAL: (PLUS|MINUS)? DIGITS;
DIGIT: '0'..'9';
DIGITS: DIGIT+;

But now notice that DIGIT and DIGITS will never become tokens on their own, because INTLITERAL will always be matched first. In this case, you can make both of these rules fragments, and then it doesn't matter where you place them, because fragment rules are only used inside other lexer rules (not in parser rules).

Make DIGIT and DIGITS fragments:

fragment DIGIT: '0'..'9';
fragment DIGITS: DIGIT+;
INTLITERAL: (PLUS|MINUS)? DIGITS;

Solution 3

Or better, don't glue the operator onto INTLITERAL, but match it in a unary expression:

expression
    : (MINUS | PLUS) expression
    | expression (MINUS | PLUS) expression
    | integerLiteral
    ;

integerLiteral
    : INTLITERAL
    ;

PLUS: '+';
MINUS: '-';

fragment DIGIT: '0'..'9';

INTLITERAL: DIGIT+;
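One reason to keep the sign out of INTLITERAL follows from the longest-match rule: in an input like 1-2, a signed literal swallows the minus sign, so the parser never sees a binary operator. The Python sketch below illustrates this. It is a simulation under assumptions, not ANTLR itself: the `tokenize` helper and the regex stand-ins for the lexer rules are hypothetical.

```python
import re

def tokenize(rules, text):
    """Longest match wins; on a tie, the rule listed first wins."""
    compiled = [(name, re.compile(pattern)) for name, pattern in rules]
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, regex in compiled:
            m = regex.match(text, pos)
            # Strict '>' keeps the earlier rule when lengths are equal.
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        if best_name != "WS":  # mimic '-> skip'
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

# INTLITERAL with the optional sign glued on, as in the fragment version.
GLUED_SIGN = [
    ("PLUS", r"\+"),
    ("MINUS", r"-"),
    ("INTLITERAL", r"[+-]?[0-9]+"),
    ("WS", r"[ \t\r\n]"),
]

# INTLITERAL without a sign, as in Solution 3.
UNSIGNED = [
    ("PLUS", r"\+"),
    ("MINUS", r"-"),
    ("INTLITERAL", r"[0-9]+"),
    ("WS", r"[ \t\r\n]"),
]

print(tokenize(GLUED_SIGN, "1-2"))
# [('INTLITERAL', '1'), ('INTLITERAL', '-2')]
print(tokenize(UNSIGNED, "1-2"))
# [('INTLITERAL', '1'), ('MINUS', '-'), ('INTLITERAL', '2')]
```

With the unsigned rule, the parser receives the MINUS token and can decide from context whether it is a unary or a binary operator.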
