ANTLR: How to use lexer rules that start the same way?
Question
How can I use lexer rules that begin with the same pattern?
I am trying to use two similar lexer rules (both start with the same pattern):
TIMECONSTANT: ('0'..'9')+ ':' ('0'..'9')+;
INTEGER : ('0'..'9')+;
COLON : ':';
Here is my sample grammar:
grammar TestTime;
text : (timeexpr | caseblock)*;
timeexpr : TIME;
caseblock : INT COLON ID;
TIME : ('0'..'9')+ ':' ('0'..'9')+;
INT : ('0'..'9')+;
COLON : ':';
ID : ('a'..'z')+;
WS : (' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;};
When I try to parse this text:
12:44
123 : abc
123: abc
The first two lines are parsed correctly, but the third one produces an error. For some reason, ANTLR lexes '123:' as TIME (which it is not)...
So, is it possible to write a grammar with such lexemes?
My language needs such rules in order to support both case blocks and date/time constants. For example, in my language it is possible to write:
case MyInt of
1: a := 01.01.2012;
2: b := 12:44;
3: ....
end;
Recommended answer
As soon as DIGIT+ ':' is matched, the lexer expects it to be followed by another DIGIT in order to match a TIMECONSTANT. If that does not happen, it cannot fall back on another lexer rule that matches DIGIT+ ':', and the lexer will not give up the already matched ':' to match an INTEGER.
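The behaviour described above can be illustrated with a small hand-written model. This is only a sketch under the assumption that the lexer commits to TIME as soon as it has seen digits followed by ':'; it is not ANTLR's actual implementation, and the class and method names (CommittingLexerSketch, firstToken) are made up for this illustration.

```java
// Simplified model of a lexer that commits to TIME once it sees DIGIT+ ':'.
// NOT ANTLR's real algorithm; all names here are hypothetical.
final class CommittingLexerSketch {

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    /** Lexes only the first token of the input and describes the result. */
    static String firstToken(String input) {
        int i = 0;
        while (i < input.length() && isDigit(input.charAt(i))) {
            i++;
        }
        if (i == 0) {
            return "ERROR: expected digit at index 0";
        }
        // No ':' directly after the digits: this can only be an INT.
        if (i == input.length() || input.charAt(i) != ':') {
            return "INT '" + input.substring(0, i) + "'";
        }
        // Digits followed by ':': commit to TIME and consume the ':'.
        int afterColon = i + 1;
        int j = afterColon;
        while (j < input.length() && isDigit(input.charAt(j))) {
            j++;
        }
        if (j == afterColon) {
            // Committed to TIME, but no digit follows and there is no fallback to INT.
            return "ERROR: expected digit at index " + afterColon;
        }
        return "TIME '" + input.substring(0, j) + "'";
    }

    public static void main(String[] args) {
        for (String line : new String[] {"12:44", "123 : abc", "123: abc"}) {
            System.out.println(line + " -> " + firstToken(line));
        }
    }
}
```

In this model, the three sample lines from the question behave as reported: the first two yield a token, while the third fails right after the ':'.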
A possible solution is to optionally match ':' DIGIT+ at the end of the INTEGER rule and change the type of the token if that part gets matched:
grammar T;
parse
: (t=. {System.out.printf("\%-15s '\%s'\n", tokenNames[$t.type], $t.text);})* EOF
;
INTEGER : DIGIT+ ((':' DIGIT)=> ':' DIGIT+ {$type=TIMECONSTANT;})?;
COLON : ':';
SPACE : ' ' {skip();};
fragment DIGIT : '0'..'9';
fragment TIMECONSTANT : ;
When parsing the input:
11: 12:13 : 14
the following is printed:
INTEGER '11'
COLON ':'
TIMECONSTANT '12:13'
COLON ':'
INTEGER '14'
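For readers who want to see the same idea outside of ANTLR, here is a minimal hand-written sketch in Java. It assumes the same three token kinds as grammar T and mimics the (':' DIGIT)=> predicate by peeking two characters ahead before consuming the ':'. The class and method names (LookaheadLexerSketch, tokenize) are hypothetical and not part of ANTLR.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written equivalent of the INTEGER rule with a (':' DIGIT)=> predicate.
// Illustrative sketch only; names are hypothetical.
final class LookaheadLexerSketch {

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    /** Returns tokens formatted as: TYPE 'text'. Spaces are skipped. */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == ' ') {            // SPACE : ' ' {skip();};
                i++;
                continue;
            }
            if (c == ':') {            // COLON : ':';
                tokens.add("COLON ':'");
                i++;
                continue;
            }
            if (!isDigit(c)) {
                throw new IllegalArgumentException(
                        "Unexpected character '" + c + "' at index " + i);
            }
            int start = i;
            while (i < input.length() && isDigit(input.charAt(i))) {
                i++;                   // DIGIT+
            }
            String type = "INTEGER";
            // Predicate: only continue if ':' is directly followed by a digit.
            if (i + 1 < input.length()
                    && input.charAt(i) == ':'
                    && isDigit(input.charAt(i + 1))) {
                i++;                   // consume ':'
                while (i < input.length() && isDigit(input.charAt(i))) {
                    i++;               // DIGIT+
                }
                type = "TIMECONSTANT"; // corresponds to {$type=TIMECONSTANT;}
            }
            tokens.add(type + " '" + input.substring(start, i) + "'");
        }
        return tokens;
    }

    public static void main(String[] args) {
        tokenize("11: 12:13 : 14").forEach(System.out::println);
    }
}
```

The key design point is that the ':' is consumed only after the lookahead has confirmed a digit behind it, so "11:" still splits into an INTEGER and a COLON.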
EDIT
Not too nice, but works...
True. However, this is not an ANTLR shortcoming: most lexer generators I know will have a problem properly tokenizing such a TIMECONSTANT (when INTEGER and COLON are also present). ANTLR at least provides a way to handle it in the lexer :)
You could also let this be handled by the parser instead of the lexer:
time_const : INTEGER COLON INTEGER;
INTEGER : '0'..'9'+;
COLON : ':';
SPACE : ' ' {skip();};
However, if your language's lexer ignores white space, then input like:
12 : 34
would, of course, also be matched by the time_const rule.
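The consequence of the parser-level approach can be sketched in plain Java. The example below assumes a tokenizer that only knows INTEGER and COLON and drops spaces, and then checks for the token sequence INTEGER COLON INTEGER, as the time_const rule does. The names ParserLevelTimeSketch, tokenTypes and isTimeConst are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of matching time_const : INTEGER COLON INTEGER at the parser level.
// Illustrative only; names are hypothetical.
final class ParserLevelTimeSketch {

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    /** Produces the token types of the input; spaces are skipped by the lexer. */
    static List<String> tokenTypes(String input) {
        List<String> types = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == ' ') {
                i++;                          // skipped, never reaches the parser
            } else if (c == ':') {
                types.add("COLON");
                i++;
            } else if (isDigit(c)) {
                while (i < input.length() && isDigit(input.charAt(i))) {
                    i++;
                }
                types.add("INTEGER");
            } else {
                throw new IllegalArgumentException(
                        "Unexpected character '" + c + "' at index " + i);
            }
        }
        return types;
    }

    /** True if the whole input is exactly INTEGER COLON INTEGER. */
    static boolean isTimeConst(String input) {
        return tokenTypes(input).equals(List.of("INTEGER", "COLON", "INTEGER"));
    }

    public static void main(String[] args) {
        System.out.println(isTimeConst("12:34"));
        System.out.println(isTimeConst("12 : 34"));
    }
}
```

Because the spaces are gone before the parser sees the tokens, both inputs look identical at that level, which is exactly the trade-off mentioned above.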