How to force ANTLR to parse the entire input CharStream

Views: 415  Posted: 2020/9/2 22:49:29  Tags: java, parsing, antlr, antlr4

Problem description

I'm using ANTLR4 to parse a syntax file, and I ran into a problem when using BaseErrorListener to detect errors. Given an illegal input string, ANTLR matches the appropriate branch and then ignores the rest of the character stream, even if it contains errors. I want to detect those errors. My .g4 files and Java files are below.
TransitionLexer is my lexer grammar and TransitionCondition is my parser grammar. ErrorDialogListener.java is my error listener and Test.java is the main Java file.

TransitionLexer.g4

lexer grammar TransitionLexer;

BOOLEAN: 'true' | 'false';
IF: 'if';
THEN: 'then';
ELSE: 'else';

NAME: (ALPHA | CHINESE | '_')(ALPHA | CHINESE | '_'|DIGIT)*;

ALPHA: [a-zA-Z];
CHINESE: [\u4e00-\u9fa5];

NUMBER: INT | REAL;
INT: DIGIT+
    |'(-'DIGIT+')';
REAL: DIGIT+ ('.' DIGIT+)?
    | '(-' DIGIT+ ('.' DIGIT+)? ')';
fragment DIGIT: [0-9];

OPCOMPARE: '='|'>='|'<='|'>'|'<';
WS: [ \t\n\r]+ ->skip;
SL_COMMENT:  '/*' .*? '*/' ->skip;

TransitionCondition.g4

grammar TransitionCondition;
import TransitionLexer;

condition : stat+;
stat : expr;
expr: expr (('and' | 'or') expr)+
    | '(' expr ')'
    | '(' var OPCOMPARE value ')'
    | booleanExpr
    | BOOLEAN
    ;

var: localStates
     | globalStates
     | connector
     ;
localStates: NAME;
globalStates: 'Top' ('.' brick)+ '.' NAME;
connector: brick '.' NAME;

value: userdefinedValue | basicValue;
userdefinedValue: NAME;
basicValue: basicValue op=('*'|'/') basicValue
                    | basicValue op=('+' | '-') basicValue
                    | basicValue ('and' | 'or') basicValue
                    | NUMBER | BOOLEAN
                    | '(' basicValue ')'
                    ;

booleanExpr: booleanExpr OPCOMPARE booleanExpr
           | '(' booleanExpr ')'
           | NUMBER (OPCOMPARE|'*'| '/'|'+'|'-') NUMBER
           ;
brick: NAME;

ErrorDialogListener.java

package errorprocess;

import java.awt.Color;
import java.awt.Container;
import java.util.Collections;
import java.util.List;

import javax.swing.JDialog;
import javax.swing.JFrame;
import javax.swing.JLabel;

import org.antlr.v4.runtime.BaseErrorListener;
import org.antlr.v4.runtime.Parser;
import org.antlr.v4.runtime.RecognitionException;
import org.antlr.v4.runtime.Recognizer;
import org.antlr.v4.runtime.atn.ATNConfigSet;
import org.antlr.v4.runtime.dfa.DFA;

public class ErrorDialogListener extends BaseErrorListener {


    @Override
    public void reportContextSensitivity(Parser recognizer, DFA dfa, int startIndex, int stopIndex, int prediction,
            ATNConfigSet configs) {
        System.out.println(dfa.toLexerString());
        System.out.println(dfa.getStates());        
        super.reportContextSensitivity(recognizer, dfa, startIndex, stopIndex, prediction, configs);
    }

    @Override
    public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol, int line, int charPositionInLine,
            String msg, RecognitionException e) {
        List<String> stack = ((Parser)recognizer).getRuleInvocationStack();
        Collections.reverse(stack);
        StringBuilder buf = new StringBuilder();
        buf.append("rule stack: "+stack+" ");
        buf.append("line "+line+":"+charPositionInLine+" at "+
                   offendingSymbol+": "+msg);

        JDialog dialog = new JDialog();
        Container contentPane = dialog.getContentPane();
        contentPane.add(new JLabel(buf.toString()));
        contentPane.setBackground(Color.white);
        dialog.setTitle("Syntax error");
        dialog.pack();
        dialog.setLocationRelativeTo(null);
        dialog.setDefaultCloseOperation(JFrame.DISPOSE_ON_CLOSE);
        dialog.setVisible(true);
    }

}

Test.java

package errorprocess;

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.antlr.v4.runtime.*;
import org.antlr.v4.runtime.atn.PredictionMode;

import antlr4.my.transition.TransitionConditionLexer;
import antlr4.my.transition.TransitionConditionParser;

public class Test {

    public static void main(String[] args) throws IOException {
        InputStream in = new FileInputStream("G:\\AltaRica\\ANTLR4\\test\\condition\\t.expr");
        ANTLRInputStream input = new ANTLRInputStream(in);
        TransitionConditionLexer lexer = new TransitionConditionLexer(input);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        TransitionConditionParser parser = new TransitionConditionParser(tokens);
        parser.removeErrorListeners();
        parser.addErrorListener(new ErrorDialogListener());
//        parser.addErrorListener(new DiagnosticErrorListener());
//        parser.getInterpreter().setPredictionMode(PredictionMode.LL_EXACT_AMBIG_DETECTION);
//        parser.getInterpreter().setPredictionMode(PredictionMode.LL);
        parser.condition();
    }

}

The main problem

When my input is (Top.b2.states = nominal) and (b1.i1 = wrong) and (states >= 5.5), the parser works fine.
But when my input is (Top.b2.states = nominal) aaa (b1.i1 = wrong) and (states >= 5.5), the parser only parses (Top.b2.states = nominal) and ignores everything from aaa onward, even though that input does not conform to the grammar.
My guess is that the parser follows the second branch of the expr rule in TransitionCondition.g4, which is expr: '(' expr ')', and simply ignores the others. So how can I force ANTLR to recognize all of the input, or force it to choose only the first branch (expr: expr (('and' | 'or') expr)+) in this situation?

I tried using DiagnosticErrorListener and overriding reportContextSensitivity(), but neither seemed to work.
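
A note on the behaviour described above, offered as a sketch rather than a confirmed diagnosis: a start rule such as condition : stat+ can succeed after matching only a valid prefix of the input, so no syntax error is ever raised for what follows. In ANTLR 4 the usual way to require the whole input is to end the start rule with the built-in EOF token, for example condition : stat+ EOF; (this change is a suggestion and is not part of the original grammar). The plain-Java toy below is not ANTLR and uses a deliberately simplified rule; the class name PrefixParseDemo and its methods are hypothetical. It only illustrates why the end-of-input check turns a silent prefix match into a reported error.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: NOT ANTLR and NOT the grammar from the question.
// Simplified rule: condition : comparison (('and' | 'or') comparison)* [EOF] ;
final class PrefixParseDemo {
    private static final String EOF = "<EOF>";
    private static final List<String> OPS = Arrays.asList("=", ">=", "<=", ">", "<");

    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    private PrefixParseDemo(String input) {
        // Crude stand-in for a lexer: pad parentheses, then split on whitespace.
        String padded = input.replace("(", " ( ").replace(")", " ) ").trim();
        for (String t : padded.split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
    }

    // Current token, or the EOF marker once every token has been consumed.
    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : EOF;
    }

    private void expect(String expected) {
        if (!peek().equals(expected)) {
            throw new IllegalStateException(
                "expected '" + expected + "' but found '" + peek() + "' at token " + pos);
        }
        pos++;
    }

    private void operand() {
        String t = peek();
        if (t.equals("(") || t.equals(")") || t.equals(EOF) || OPS.contains(t)) {
            throw new IllegalStateException(
                "expected an operand but found '" + t + "' at token " + pos);
        }
        pos++;
    }

    // comparison : '(' operand OP operand ')' ;
    private void comparison() {
        expect("(");
        operand();
        if (!OPS.contains(peek())) {
            throw new IllegalStateException(
                "expected an operator but found '" + peek() + "' at token " + pos);
        }
        pos++;
        operand();
        expect(")");
    }

    // Parses the input; when requireEof is true, leftover tokens are an error.
    static String parse(String input, boolean requireEof) {
        PrefixParseDemo p = new PrefixParseDemo(input);
        try {
            p.comparison();
            while (p.peek().equals("and") || p.peek().equals("or")) {
                p.pos++;
                p.comparison();
            }
            if (requireEof && !p.peek().equals(EOF)) {
                throw new IllegalStateException(
                    "unexpected '" + p.peek() + "' at token " + p.pos);
            }
            return "ok, consumed " + p.pos + " of " + p.tokens.size() + " tokens";
        } catch (IllegalStateException e) {
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        String bad = "(Top.b2.states = nominal) aaa (b1.i1 = wrong) and (states >= 5.5)";
        System.out.println(parse(bad, false)); // ok, consumed 5 of 17 tokens
        System.out.println(parse(bad, true));  // error: unexpected 'aaa' at token 5
    }
}
```

Without the end-of-input check, the toy parser stops quietly in front of aaa, which mirrors the symptom in the question. With the check, the same input is rejected at the first token that cannot continue the rule.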
