ANTLR4 Grammar - Issue with "dot" in fields and extended expressions
Question
I have the following ANTLR4 grammar:
grammar ExpressionGrammar;
parse: (expr)
;
expr: MIN expr
| expr ( MUL | DIV ) expr
| expr ( ADD | MIN ) expr
| NUM
| function
| '(' expr ')'
;
function : ID '(' arguments? ')';
arguments: expr ( ',' expr)*;
/* Tokens */
MUL : '*';
DIV : '/';
MIN : '-';
ADD : '+';
OPEN_PAR : '(' ;
CLOSE_PAR : ')' ;
NUM : '0' | [1-9][0-9]*;
ID : [a-zA-Z_] [a-zA-Z]*;
COMMENT: '//' ~[\r\n]* -> skip;
WS: [ \t\n]+ -> skip;
I have an input expression like this:
(Fields.V1)*(Fields.V2) + (Constants.Value1)*(Constants.Value2)
The ANTLR parser generated the following text from the grammar above:
(FieldsV1)*(FieldsV2)+(Constants<missing ')'>
As you can see, the dots in Fields.V1 and Fields.V2 are missing from the text, and there is also a <missing ')'> error node. I believe I should somehow make ANTLR understand that an expression can also contain fields with dot operators.
A further question on top of this:
(Var1)(Var2)
ANTLR does not report an error for the input above. An expression should never be (Var1)(Var2); there should always be an operator between the operands, as in (var1)*(var2) or (var1)+(var2). The parse tree contains no error node for this case. How should the grammar be modified so that this scenario is also caught?
Recommended answer
To recognize IDs like Fields.V1, change your Lexer rule for ID to something like this:
fragment ID_NODE: [a-zA-Z_][a-zA-Z0-9]*;
ID: ID_NODE ('.' ID_NODE)*;
Notice that, since each "node" of the ID follows the same rule, I made it a lexer fragment that I could use to compose the ID rule. I also added 0-9 to the second part of the fragment, since it appears that you want to allow digits in IDs.
The ID rule then uses the fragment to build a Lexer rule that allows dots inside an ID.
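As a rough illustration of what the two ID rules accept, here is a minimal sketch that mirrors their character patterns with Python regular expressions. This is not ANTLR: the names OLD_ID, NEW_ID and longest_prefix are hypothetical and exist only for this example, and Python's re engine stands in for the lexer's longest-match behavior.

```python
import re

# Hypothetical regex mirrors of the lexer rules (illustration only, not ANTLR).
ID_NODE = r"[a-zA-Z_][a-zA-Z0-9]*"                  # fragment ID_NODE
OLD_ID = re.compile(r"[a-zA-Z_][a-zA-Z]*")          # the question's ID rule
NEW_ID = re.compile(rf"{ID_NODE}(?:\.{ID_NODE})*")  # ID: ID_NODE ('.' ID_NODE)*


def longest_prefix(pattern, text):
    """Return the prefix of text that the pattern matches, or '' if none."""
    match = pattern.match(text)
    return match.group(0) if match else ""


if __name__ == "__main__":
    for sample in ("Fields.V1", "Constants.Value1"):
        print(sample, "| old:", longest_prefix(OLD_ID, sample),
              "| new:", longest_prefix(NEW_ID, sample))
```

With the old pattern the match stops at the dot, which no rule in the original grammar can tokenize. The new pattern consumes the whole dotted name as a single ID.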
You also didn't add ID as a valid expr alternative.
To detect the error condition in (Var1)(Var2), follow Mike's advice and add the built-in EOF token to the end of the parse parser rule. Without EOF, ANTLR stops parsing as soon as it reaches the end of a recognized expr, here (Var1). The trailing EOF says "and then you need to find an EOF", so ANTLR continues parsing into (Var2) and gives you the error.
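The effect of the trailing EOF can be sketched with a tiny hand-written recursive-descent parser. This is only an illustration under simplifying assumptions: it is not how ANTLR parses internally, it ignores operator precedence and function calls, and tokenize, parse_expr, parse_atom and parse are hypothetical names made up for the example.

```python
import re

# Toy tokenizer: IDs, numbers, and single-character operators/parentheses.
TOKEN = re.compile(r"\s*(?:(?P<ID>[A-Za-z_][A-Za-z0-9]*)|(?P<NUM>\d+)|(?P<OP>[-+*/()]))")
OPERATORS = {"+", "-", "*", "/"}


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(match.lastgroup))
        pos = match.end()
    return tokens


def parse_atom(tokens, i):
    """Parse ID | NUM | '-' atom | '(' expr ')'; return the index after it."""
    if i >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[i]
    if tok == "-":
        return parse_atom(tokens, i + 1)
    if tok == "(":
        i = parse_expr(tokens, i + 1)
        if i >= len(tokens) or tokens[i] != ")":
            raise SyntaxError("missing ')'")
        return i + 1
    if tok in {")", "+", "*", "/"}:
        raise SyntaxError(f"unexpected token {tok!r}")
    return i + 1  # ID or NUM


def parse_expr(tokens, i):
    """Parse atom (operator atom)*; return the index after the expression."""
    i = parse_atom(tokens, i)
    while i < len(tokens) and tokens[i] in OPERATORS:
        i = parse_atom(tokens, i + 1)
    return i


def parse(text, require_eof):
    """Return the number of tokens consumed; optionally insist on end of input."""
    tokens = tokenize(text)
    end = parse_expr(tokens, 0)
    if require_eof and end != len(tokens):
        raise SyntaxError(f"extraneous input {tokens[end]!r}")
    return end
```

Calling parse("(Var1)(Var2)", require_eof=False) stops quietly after the first three tokens, which mirrors the rule without EOF. With require_eof=True the same input raises a SyntaxError at the second opening parenthesis.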
A revised version that handles both of your examples:
grammar ExpressionGrammar;
parse: expr EOF;
expr:
MIN expr
| expr ( MUL | DIV) expr
| expr ( ADD | MIN) expr
| NUM
| ID
| function
| '(' expr ')';
function: ID '(' arguments? ')';
arguments: expr ( ',' expr)*;
/* Tokens */
MUL: '*';
DIV: '/';
MIN: '-';
ADD: '+';
OPEN_PAR: '(';
CLOSE_PAR: ')';
NUM: '0' | [1-9][0-9]*;
fragment ID_NODE: [a-zA-Z_][a-zA-Z0-9]*;
ID: ID_NODE ('.' ID_NODE)*;
COMMENT: '//' ~[\r\n]* -> skip;
WS: [ \t\n]+ -> skip;
(Now that I've read through the comments, this is pretty much just applying the suggestions made there.)