Antlr4: parse dot-ending float with double-dots
Question
I'm trying to parse a sentence that contains a dot-ending float and a double-dot range, but I cannot get it to work.
Here is my grammar file:
grammar foo;
Digits
: [0-9]+
;
Real
: Digits* '.' Digits+
| Digits+ '.' Digits*
;
Range
: '..'
;
Whitespace
: [ \t]+
-> skip
;
Newline
: ( '\r' '\n'?
| '\n'
)
-> skip
;
range
: Digits Range Digits
;
and the input (a file named foo.c), in two variants.
Code 1:
1..2
Code 2:
1 ..2
I use the following to compile and test:
antlr4 foo.g4
javac foo*.java
grun foo range -gui foo.c
Code 1 produces errors:
line 1:2 token recognition error at: '. '
line 1:0 extraneous input '1.' expecting Digits
line 1:5 mismatched input '<EOF>' expecting '..'
Code 2, however, parses fine.
Adding the extra space makes it work, but I want a grammar that can parse code 1 without the extra space.
Recommended answer
That is how ANTLR's lexer works: at each position it tries to match as many characters as possible. So the input 1..2 produces 2 Real tokens, '1.' and '.2', and not the 3 tokens Digits, Range and Digits. With the extra space (1 ..2), the longest match at the start is just '1', which is why code 2 works.
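The longest-match behaviour can be reproduced outside ANTLR with a small Python sketch. This is only a simulation under simplifying assumptions, not ANTLR's actual implementation: the rules from the question are rewritten as regular expressions, and the names RULES, SKIPPED and tokenize are hypothetical helpers made up for this illustration.

```python
import re

# Lexer rules from the question, in grammar order. Each rule lists its
# alternatives separately so that the longest alternative can be chosen.
RULES = [
    ("Digits", [r"[0-9]+"]),
    ("Real", [r"[0-9]*\.[0-9]+", r"[0-9]+\.[0-9]*"]),
    ("Range", [r"\.\."]),
    ("Whitespace", [r"[ \t]+"]),
    ("Newline", [r"\r\n?|\n"]),
]
SKIPPED = {"Whitespace", "Newline"}  # rules marked "-> skip"


def tokenize(text):
    """Longest-match tokenizer; on equal length the earlier rule wins."""
    tokens, pos = [], 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, alternatives in RULES:
            for pattern in alternatives:
                match = re.compile(pattern).match(text, pos)
                # Strict ">" keeps the earlier rule when lengths are equal.
                if match and len(match.group()) > best_len:
                    best_name, best_len = name, len(match.group())
        if best_name is None:
            raise ValueError(f"token recognition error at position {pos}")
        if best_name not in SKIPPED:
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


print(tokenize("1..2"))   # code 1
print(tokenize("1 ..2"))  # code 2
```

For code 1 the sketch yields two Real tokens, because '1.' is longer than '1'. For code 2 the space stops the first match at '1', so the result is Digits, Range, Digits.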
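To create 3 tokens, you will have to add a predicate to your lexer grammar, so that a trailing '.' is only accepted as part of a float when the next character is not another '.'. Try something like this (the predicate is written for the Java target):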
FLOAT
: [0-9]+ '.' {_input.LA(1) != '.'}?
| [0-9]* '.' [0-9]+
;
INT
: [0-9]+
;
RANGE
: '..'
;
SPACE
: [ \t\r\n] -> skip
;
If I create a lexer from the rules above and feed it the input "1 2. .34 56..7 8.99999", I get the following tokens:
INT '1'
FLOAT '2.'
FLOAT '.34'
INT '56'
RANGE '..'
INT '7'
FLOAT '8.99999'
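The effect of the predicate can be checked with a rough Python simulation. This is a sketch under assumptions, not ANTLR itself: the predicate {_input.LA(1) != '.'}? is approximated by the regex negative lookahead (?!\.), and the names RULES and tokenize_with_predicate are hypothetical.

```python
import re

# Lexer rules from the answer, in grammar order. The first FLOAT
# alternative uses a negative lookahead in place of the ANTLR predicate:
# a trailing '.' only counts if the next character is not another '.'.
RULES = [
    ("FLOAT", [r"[0-9]+\.(?!\.)", r"[0-9]*\.[0-9]+"]),
    ("INT", [r"[0-9]+"]),
    ("RANGE", [r"\.\."]),
    ("SPACE", [r"[ \t\r\n]"]),
]
SKIPPED = {"SPACE"}  # rule marked "-> skip"


def tokenize_with_predicate(text):
    """Longest-match tokenizer; on equal length the earlier rule wins."""
    tokens, pos = [], 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, alternatives in RULES:
            for pattern in alternatives:
                match = re.compile(pattern).match(text, pos)
                if match and len(match.group()) > best_len:
                    best_name, best_len = name, len(match.group())
        if best_name is None:
            raise ValueError(f"token recognition error at position {pos}")
        if best_name not in SKIPPED:
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


for name, value in tokenize_with_predicate("1 2. .34 56..7 8.99999"):
    print(name, repr(value))
```

In this simulation '56.' is rejected as a FLOAT because a second '.' follows, so the lexer falls back to INT '56' and then matches RANGE '..'. The input 1..2 from the question comes out as INT, RANGE, INT in the same way.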