ANTLR4 Grammar - Issue with "dot" in fields and extended expressions

Question

I have the following ANTLR4 grammar:

grammar ExpressionGrammar;

parse: (expr)
     ;

expr: MIN expr
    | expr ( MUL | DIV ) expr
    | expr ( ADD | MIN ) expr
    | NUM
    | function
    | '(' expr ')'
    ;

function : ID '(' arguments? ')';

arguments: expr ( ',' expr)*;

/* Tokens */

MUL : '*';
DIV : '/';
MIN : '-';
ADD : '+';
OPEN_PAR : '(' ;
CLOSE_PAR : ')' ;

NUM : '0' | [1-9][0-9]*;
ID : [a-zA-Z_] [a-zA-Z]*;
COMMENT: '//' ~[\r\n]* -> skip;
WS: [ \t\n]+ -> skip;

I have an input expression like this:

(Fields.V1)*(Fields.V2) + (Constants.Value1)*(Constants.Value2)

The ANTLR parser generated the following text from the grammar above:

(FieldsV1)*(FieldsV2)+(Constants<missing ')'> 

As you can see, the dots in Fields.V1 and Fields.V2 are missing from the text, and there is also a <missing ')'> error node. I believe I need to make ANTLR understand that an expression can also contain fields that use the dot operator.

A further question:

 (Var1)(Var2)    

ANTLR does not report an error for the input above. An expression should never have the form (Var1)(Var2); there should always be an operator between the operands, as in (var1)*(var2) or (var1)+(var2). The parse tree contains no error node for this case. How should the grammar be modified so that this scenario is rejected as well?

Recommended answer

To recognize IDs like Fields.V1, change your lexer rule for ID to something like this:

fragment ID_NODE: [a-zA-Z_][a-zA-Z0-9]*;
ID: ID_NODE ('.' ID_NODE)*;

Notice that, since each "node" of the ID follows the same rule, I made it a lexer fragment that I could use to compose the ID rule. I also added 0-9 to the second part of the fragment, since it appears that you want to allow digits in IDs.

Then the ID rule uses the fragment to build out the Lexer rule that allows for dots in the ID.
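
As a rough way to see which strings the composed rule accepts, the two lexer rules can be mirrored with a regular expression. This is only a sketch: Python's `re` module stands in for the ANTLR lexer, and the helper names `is_id` and `longest_id_prefix` are made up for illustration.

```python
import re

# Regex mirror of the lexer rules (an approximation for illustration only):
#   fragment ID_NODE: [a-zA-Z_][a-zA-Z0-9]*;
#   ID: ID_NODE ('.' ID_NODE)*;
ID_NODE = r"[a-zA-Z_][a-zA-Z0-9]*"
ID = re.compile(rf"{ID_NODE}(?:\.{ID_NODE})*")


def is_id(text: str) -> bool:
    """Return True when the whole string would form a single ID."""
    return ID.fullmatch(text) is not None


def longest_id_prefix(text: str) -> str:
    """Return the longest prefix the ID pattern consumes, or '' if none."""
    match = ID.match(text)
    return match.group(0) if match else ""


if __name__ == "__main__":
    for sample in ["Fields.V1", "Constants.Value2", "Fields.", "1abc", "x_y"]:
        print(sample, is_id(sample))
```

One detail this makes visible: because the underscore is only allowed as the first character of a node, a name such as x_y is not a single ID under this rule. If that is not what you want, add `_` to the second character class of ID_NODE.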

You also didn't add ID as a valid expr alternative, so in your grammar an identifier can only appear as a function name.

To detect the error condition in (Var1)(Var2), follow Mike's advice from the comments and add the built-in EOF token to the end of the parse parser rule. Without the EOF, ANTLR stops parsing as soon as it reaches the end of a recognized expr, here (Var1), and silently ignores the rest of the input. The EOF says "and then you need to find the end of the input", so ANTLR continues parsing into (Var2) and gives you the error.

A revised version that handles both of your examples:

grammar ExpressionGrammar;

parse: expr EOF;

expr:
    MIN expr
    | expr ( MUL | DIV) expr
    | expr ( ADD | MIN) expr
    | NUM
    | ID
    | function
    | '(' expr ')';

function: ID '(' arguments? ')';

arguments: expr ( ',' expr)*;

/* Tokens */

MUL: '*';
DIV: '/';
MIN: '-';
ADD: '+';
OPEN_PAR: '(';
CLOSE_PAR: ')';

NUM: '0' | [1-9][0-9]*;
fragment ID_NODE: [a-zA-Z_][a-zA-Z0-9]*;
ID: ID_NODE ('.' ID_NODE)*;
COMMENT: '//' ~[\r\n]* -> skip;
WS: [ \t\r\n]+ -> skip;

(Now that I've read through the comments, this is pretty much just applying the suggestions in the comments)
