如何调试ANTLR4语法无关/不匹配的输入错误 [英] How to debug ANTLR4 grammar extraneous / mismatched input error

查看：28 发布时间：2021/11/11 4:09:08 parsing antlr4

本文介绍了如何调试ANTLR4语法无关/不匹配的输入错误的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我想解析规则手册demo.rb"文件，如下所示:

I want to parse the Rulebook "demo.rb" files like below:

rulebook Titanic-Normalization {
  version 1

  meta {
    description "Test"
    source "my-rules.xslx"
    user "joltie"
  }

  rule remove-first-line {
    description "Removes first line when offset is zero"
    when(present(offset) && offset == 0) then {
      filter-row-if-true true;
    }
  }
}

我编写了如下所示的 ANTLR4 语法文件 Rulebook.g4.目前，它可以很好地解析*.rb文件，但遇到表达式"/语句"规则时会抛出意外错误.

I wrote the ANTLR4 grammar file Rulebook.g4 like below. For now, it can parse the *.rb files generally well, but it throw unexpected error when encounter the "expression" / "statement" rules.

grammar Rulebook;

rulebookStatement
    :   KWRulebook
        (GeneralIdentifier | Identifier)
        '{'
        KWVersion
        VersionConstant
        metaStatement
        (ruleStatement)+
        '}'
    ;

metaStatement
    :   KWMeta
        '{'
        KWDescription
        StringLiteral
        KWSource
        StringLiteral
        KWUser
        StringLiteral
        '}'
    ;

ruleStatement
    :   KWRule
        (GeneralIdentifier | Identifier)
        '{'
        KWDescription
        StringLiteral
        whenThenStatement
        '}'
    ;

whenThenStatement
    :   KWWhen '(' expression ')'
        KWThen '{' statement '}'
    ;

primaryExpression
    :   GeneralIdentifier
    |   Identifier
    |   StringLiteral+
    |   '(' expression ')'
    ;

postfixExpression
    :   primaryExpression
    |   postfixExpression '[' expression ']'
    |   postfixExpression '(' argumentExpressionList? ')'
    |   postfixExpression '.' Identifier
    |   postfixExpression '->' Identifier
    |   postfixExpression '++'
    |   postfixExpression '--'
    ;

argumentExpressionList
    :   assignmentExpression
    |   argumentExpressionList ',' assignmentExpression
    ;

unaryExpression
    :   postfixExpression
    |   '++' unaryExpression
    |   '--' unaryExpression
    |   unaryOperator castExpression
    ;

unaryOperator
    :   '&' | '*' | '+' | '-' | '~' | '!'
    ;

castExpression
    :   unaryExpression
    |   DigitSequence // for
    ;

multiplicativeExpression
    :   castExpression
    |   multiplicativeExpression '*' castExpression
    |   multiplicativeExpression '/' castExpression
    |   multiplicativeExpression '%' castExpression
    ;

additiveExpression
    :   multiplicativeExpression
    |   additiveExpression '+' multiplicativeExpression
    |   additiveExpression '-' multiplicativeExpression
    ;

shiftExpression
    :   additiveExpression
    |   shiftExpression '<<' additiveExpression
    |   shiftExpression '>>' additiveExpression
    ;

relationalExpression
    :   shiftExpression
    |   relationalExpression '<' shiftExpression
    |   relationalExpression '>' shiftExpression
    |   relationalExpression '<=' shiftExpression
    |   relationalExpression '>=' shiftExpression
    ;

equalityExpression
    :   relationalExpression
    |   equalityExpression '==' relationalExpression
    |   equalityExpression '!=' relationalExpression
    ;

andExpression
    :   equalityExpression
    |   andExpression '&' equalityExpression
    ;

exclusiveOrExpression
    :   andExpression
    |   exclusiveOrExpression '^' andExpression
    ;

inclusiveOrExpression
    :   exclusiveOrExpression
    |   inclusiveOrExpression '|' exclusiveOrExpression
    ;

logicalAndExpression
    :   inclusiveOrExpression
    |   logicalAndExpression '&&' inclusiveOrExpression
    ;

logicalOrExpression
    :   logicalAndExpression
    |   logicalOrExpression '||' logicalAndExpression
    ;

conditionalExpression
    :   logicalOrExpression ('?' expression ':' conditionalExpression)?
    ;

assignmentExpression
    :   conditionalExpression
    |   unaryExpression assignmentOperator assignmentExpression
    |   DigitSequence // for
    ;

assignmentOperator
    :   '=' | '*=' | '/=' | '%=' | '+=' | '-=' | '<<=' | '>>=' | '&=' | '^=' | '|='
    ;

expression
    :   assignmentExpression
    |   expression ',' assignmentExpression
    ;

statement
    :   expressionStatement
    ;

expressionStatement
    :   expression+ ';'
    ;


KWRulebook: 'rulebook';
KWVersion: 'version';
KWMeta: 'meta';
KWDescription: 'description';
KWSource: 'source';
KWUser: 'user';
KWRule: 'rule';
KWWhen: 'when';
KWThen: 'then';
KWTrue: 'true';
KWFalse: 'false';

fragment
LeftParen : '(';

fragment
RightParen : ')';

fragment
LeftBracket : '[';

fragment
RightBracket : ']';

fragment
LeftBrace : '{';

fragment
RightBrace : '}';


Identifier
    :   IdentifierNondigit
        (   IdentifierNondigit
        |   Digit
        )*
    ;

GeneralIdentifier
    :   Identifier
        ('-' Identifier)+
    ;

fragment
IdentifierNondigit
    :   Nondigit
    //|   // other implementation-defined characters...
    ;

VersionConstant
    :   DigitSequence ('.' DigitSequence)*
    ;

DigitSequence
    :   Digit+
    ;

fragment
Nondigit
    :   [a-zA-Z_]
    ;

fragment
Digit
    :   [0-9]
    ;

StringLiteral
    :   '"' SCharSequence? '"'
    |   '\'' SCharSequence? '\''
    ;

fragment
SCharSequence
    :   SChar+
    ;

fragment
SChar
    :   ~["\\\r\n]
    |   '\\\n'   // Added line
    |   '\\\r\n' // Added line
    ;

Whitespace
    :   [ \t]+
        -> skip
    ;

Newline
    :   (   '\r' '\n'?
        |   '\n'
        )
        -> skip
    ;

BlockComment
    :   '/*' .*? '*/'
        -> skip
    ;

LineComment
    :   '//' ~[\r\n]*
        -> skip
    ;

我使用如下单元测试测试了规则手册解析器:

I tested the Rulebook parser with unit test like below:

    public void testScanRulebookFile() throws IOException {
        String fileName = "C:\\rulebooks\\demo.rb";
        FileInputStream fis = new FileInputStream(fileName);
        // create a CharStream that reads from standard input
        CharStream input = CharStreams.fromStream(fis);

        // create a lexer that feeds off of input CharStream
        RulebookLexer lexer = new RulebookLexer(input);

        // create a buffer of tokens pulled from the lexer
        CommonTokenStream tokens = new CommonTokenStream(lexer);

        // create a parser that feeds off the tokens buffer
        RulebookParser parser = new RulebookParser(tokens);


        RulebookStatementContext context = parser.rulebookStatement();
//        WhenThenStatementContext context = parser.whenThenStatement();

        System.out.println(context.toStringTree(parser));

//      ParseTree tree = parser.getContext(); // begin parsing at init rule
//      System.out.println(tree.toStringTree(parser)); // print LISP-style tree
    }

对于上面的demo.rb"，解析器得到如下错误.我还将 RulebookStatementContext 打印为 toStringTree.

For the "demo.rb" as above, the parser got the error as below. I also print the RulebookStatementContext as toStringTree.

line 12:25 mismatched input '&&' expecting ')'
(rulebookStatement rulebook Titanic-Normalization { version 1 (metaStatement meta { description "Test" source "my-rules.xslx" user "joltie" }) (ruleStatement rule remove-first-line { description "Removes first line when offset is zero" (whenThenStatement when ( (expression (assignmentExpression (conditionalExpression (logicalOrExpression (logicalAndExpression (inclusiveOrExpression (exclusiveOrExpression (andExpression (equalityExpression (relationalExpression (shiftExpression (additiveExpression (multiplicativeExpression (castExpression (unaryExpression (postfixExpression (postfixExpression (primaryExpression present)) ( (argumentExpressionList (assignmentExpression (conditionalExpression (logicalOrExpression (logicalAndExpression (inclusiveOrExpression (exclusiveOrExpression (andExpression (equalityExpression (relationalExpression (shiftExpression (additiveExpression (multiplicativeExpression (castExpression (unaryExpression (postfixExpression (primaryExpression offset))))))))))))))))) ))))))))))))))))) && offset == 0 ) then { filter-row-if-true true ;) }) })

我还编写了单元测试来测试短输入上下文，例如 "when (offset == 0) then {\n" + "filter-row-if-true true;\n" + "}\n" 来调试问题.但它仍然得到这样的错误:

I also write the unit test to test short input context like "when (offset == 0) then {\n" + "filter-row-if-true true;\n" + "}\n" to debug the problem. But it still got the error like:

line 1:16 mismatched input '0' expecting {'(', '++', '--', '&&', '&', '*', '+', '-', '~', '!', Identifier, GeneralIdentifier, DigitSequence, StringLiteral}
line 2:19 extraneous input 'true' expecting {'(', '++', '--', '&&', '&', '*', '+', '-', '~', '!', ';', Identifier, GeneralIdentifier, DigitSequence, StringLiteral}

尝试了两天，我没有任何进展.问题就如上，请高人给我一些关于如何调试ANTLR4语法无关/不匹配输入错误的建议.

With two day's tries, I didn't got any progress. The question is so long as above, please someone give me some advises about how to debug ANTLR4 grammar extraneous / mismatched input error.

如何调试ANTLR4语法无关/不匹配的输入错误 [英] How to debug ANTLR4 grammar extraneous / mismatched input error

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

如何调试ANTLR4语法无关/不匹配的输入错误 [英] How to debug ANTLR4 grammar extraneous / mismatched input error

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

登录关闭