ANTLR4 Grammar - Issue with "dot" in fields and extended expressions

Question

I have the following ANTLR4 grammar:

grammar ExpressionGrammar;

parse: (expr)
     ;

expr: MIN expr
    | expr ( MUL | DIV ) expr
    | expr ( ADD | MIN ) expr
    | NUM
    | function
    | '(' expr ')'
    ;

function : ID '(' arguments? ')';

arguments: expr ( ',' expr)*;

/* Tokens */

MUL : '*';
DIV : '/';
MIN : '-';
ADD : '+';
OPEN_PAR : '(' ;
CLOSE_PAR : ')' ;

NUM : '0' | [1-9][0-9]*;
ID : [a-zA-Z_] [a-zA-Z]*;
COMMENT: '//' ~[\r\n]* -> skip;
WS: [ \t\n]+ -> skip;

I have an input expression like this:

(Fields.V1)*(Fields.V2) + (Constants.Value1)*(Constants.Value2)

The parser generated by ANTLR from the grammar above produces the following text:

(FieldsV1)*(FieldsV2)+(Constants<missing ')'> 

As you can see, the dots in Fields.V1 and Fields.V2 are missing from the text, and there is also a <missing ')'> error node. I believe I need to make ANTLR understand that an expression can also contain fields with dot operators.

A further problem on top of this:

 (Var1)(Var2)    

ANTLR does not report an error for the input above. An expression should never be (Var1)(Var2); it should always have an operator between the operands, as in (Var1)*(Var2) or (Var1)+(Var2). The parse tree contains no error node for this case. How should the grammar be modified so that this scenario is also rejected?

Recommended answer

To recognize IDs like Fields.V1, change your lexer rule for ID to something like this:

fragment ID_NODE: [a-zA-Z_][a-zA-Z0-9]*;
ID: ID_NODE ('.' ID_NODE)*;

Notice that, since each "node" of the ID follows the same rule, I made it a lexer fragment that can be used to compose the ID rule. I also added 0-9 to the second part of the fragment, since it appears that you want to allow digits in IDs (as in V1 and Value2).

The ID rule then uses the fragment to build a lexer rule that allows dots inside an ID.
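The effect of the two ID rules can be approximated outside ANTLR. The following is a minimal sketch in Python using the standard `re` module, not ANTLR itself: the names `OLD_ID`, `NEW_ID`, `build_lexer`, and `tokenize` are hypothetical, and dropping unmatched characters only imitates the ANTLR lexer's default behaviour of reporting a token recognition error and skipping the character. Python alternation picks the first matching branch rather than the longest match, which makes no difference for this small token set.

```python
import re

# Regex equivalents of the two lexer rules (assumed translations, not ANTLR syntax).
OLD_ID = r"[a-zA-Z_][a-zA-Z]*"               # ID from the question
ID_NODE = r"[a-zA-Z_][a-zA-Z0-9]*"           # fragment ID_NODE from the answer
NEW_ID = rf"{ID_NODE}(?:\.{ID_NODE})*"       # ID: ID_NODE ('.' ID_NODE)*


def build_lexer(id_pattern):
    """Combine the token patterns into one regex with a named group per token."""
    spec = [
        ("NUM", r"0|[1-9][0-9]*"),
        ("ID", id_pattern),
        ("OP", r"[*/+\-]"),
        ("PAR", r"[()]"),
        ("WS", r"[ \t\n]+"),
    ]
    return re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in spec))


def tokenize(text, id_pattern):
    """Return (tokens, dropped): recognized tokens and characters no rule matched."""
    lexer = build_lexer(id_pattern)
    tokens, dropped, pos = [], [], 0
    while pos < len(text):
        match = lexer.match(text, pos)
        if match is None:
            # No rule matches here: skip one character, as a lexer error would.
            dropped.append(text[pos])
            pos += 1
            continue
        if match.lastgroup != "WS":          # whitespace is skipped, as in the grammar
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens, dropped


if __name__ == "__main__":
    print(tokenize("Fields.V1", OLD_ID))
    print(tokenize("Fields.V1", NEW_ID))
```

With the old rule the dot matches nothing and is dropped, and the rest is split into separate tokens, which is consistent with the `FieldsV1` text in the question. With the new rule `Fields.V1` comes out as a single ID token.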

You also didn't add ID as a valid expr alternative, so a bare identifier such as Fields.V1 could not be parsed as an expression.

To detect the error condition in (Var1)(Var2), follow Mike's advice and add the built-in EOF token to the end of the parse parser rule. Without EOF, ANTLR stops parsing as soon as it reaches the end of a recognized expr, here (Var1), and silently ignores the rest of the input. The EOF says "and then you need to find the end of the input", so ANTLR continues into (Var2) and reports the error.
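The role of EOF can be illustrated with a small hand-written parser. This is only a sketch in Python under stated assumptions, not what ANTLR generates: the `Parser` class, `tokenize`, and the `require_eof` flag are hypothetical names, function calls are left out, and the identifier pattern is deliberately loose. The only point is the difference between stopping after the first complete expression and insisting that the input ends there.

```python
import re

# Simplified tokens: identifiers (dots allowed), integers, operators, parentheses.
TOKEN_RE = re.compile(r"\s*([A-Za-z_][A-Za-z0-9.]*|\d+|[-+*/()])")


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):                      # expr ( ADD | MIN ) expr
        self.term()
        while self.peek() in ("+", "-"):
            self.take()
            self.term()

    def term(self):                      # expr ( MUL | DIV ) expr
        self.unary()
        while self.peek() in ("*", "/"):
            self.take()
            self.unary()

    def unary(self):                     # MIN expr
        if self.peek() == "-":
            self.take()
            self.unary()
        else:
            self.atom()

    def atom(self):                      # NUM | ID | '(' expr ')'
        token = self.take()
        if token == "(":
            self.expr()
            if self.take() != ")":
                raise SyntaxError("missing ')'")
        elif token is None or token in {"+", "-", "*", "/", ")"}:
            raise SyntaxError(f"unexpected {token!r}")


def parse(text, require_eof):
    """Parse one expression and return how many tokens were consumed."""
    parser = Parser(tokenize(text))
    parser.expr()
    if require_eof and parser.peek() is not None:
        raise SyntaxError(f"unexpected {parser.peek()!r} after expression")
    return parser.pos


if __name__ == "__main__":
    print(parse("(Var1)(Var2)", require_eof=False))   # stops after (Var1)
    try:
        parse("(Var1)(Var2)", require_eof=True)
    except SyntaxError as error:
        print("rejected:", error)
```

Without the end-of-input check, the parser accepts `(Var1)` and never looks at `(Var2)`, which mirrors a start rule with no EOF. With the check, the leftover `(` is reported as an error.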

A revised version that handles both of your examples:

grammar ExpressionGrammar;

parse: expr EOF;

expr:
    MIN expr
    | expr ( MUL | DIV) expr
    | expr ( ADD | MIN) expr
    | NUM
    | ID
    | function
    | '(' expr ')';

function: ID '(' arguments? ')';

arguments: expr ( ',' expr)*;

/* Tokens */

MUL: '*';
DIV: '/';
MIN: '-';
ADD: '+';
OPEN_PAR: '(';
CLOSE_PAR: ')';

NUM: '0' | [1-9][0-9]*;
fragment ID_NODE: [a-zA-Z_][a-zA-Z0-9]*;
ID: ID_NODE ('.' ID_NODE)*;
COMMENT: '//' ~[\r\n]* -> skip;
WS: [ \t\n]+ -> skip;

(Now that I've read through the comments, this is pretty much just applying the suggestions made there.)
