Overlapping rules - mismatched input
Problem description
My grammar (shown below, trimmed down from the original) requires somewhat overlapping rules:
grammar NOVIANum;
statement : (priorityStatement | integerStatement)* ;
priorityStatement : T_PRIO TwoDigits ;
integerStatement : T_INTEGER Integer ;
WS : [ \t\r\n]+ -> skip ;
T_PRIO : 'PRIO' ;
T_INTEGER : 'INTEGER' ;
Integer: OneToNine Digit* | ZERO ;
TwoDigits : Digit Digit ;
fragment OneToNine : ('1'..'9') ;
fragment Digit: ('0'..'9');
ZERO : [0] ;
So "Integer" and "TwoDigits" overlap to a certain extent.
The following input
INTEGER 10
PRIO 10
produces
line 2:5 mismatched input '10' expecting TwoDigits
when Integer precedes TwoDigits in the grammar, and
line 1:8 mismatched input '10' expecting Integer
when TwoDigits precedes Integer in the grammar.
Is there a way around this?
Thanks - Alex
Thanks @GRosenberg. Your suggestion did work for this small example, but when I integrated it into my full grammar it led, sure enough, to different mismatched input errors.
The reason is another lexer rule that requires a range of '[1-4]', so I thought I'd be clever and turn the grammar into
grammar NOVIANum;
statement : (priorityT | integerT | levelT )* ;
priorityT : T_PRIO twoDigits ;
integerT : T_INTEGER integer ;
levelT : T_LEVEL levelNumber ;
levelNumber : ( ZERO DIGIT ) | ( OneToFour (ZERO | DIGIT) ) ;
integer: ZERO* ( DIGIT ( DIGIT | ZERO )* ) ;
twoDigits : (ZERO | DIGIT) ( ZERO | DIGIT ) ;
oneToFour : OneToFour (DIGIT | ZERO) ;
WS : [ \t\r\n]+ -> skip ;
T_INTEGER : 'INTEGER' ;
T_LEVEL : 'LEVEL' ;
T_PRIO : 'PRIO' ;
DIGIT: OneToFour | FiveToNine ;
ZERO : '0' ;
OneToFour : [1-4] ;
FiveToNine : [5-9] ;
This still works for the previous inputs, but ...
INTEGER 350
PRIO 10
LEVEL 01
LEVEL 05
LEVEL 10
LEVEL 49
produces
[@0,0:6='INTEGER',<2>,1:0]
[@1,8:8='3',<5>,1:8]
[@2,9:9='5',<5>,1:9]
[@3,10:10='0',<6>,1:10]
[@4,12:15='PRIO',<4>,2:0]
[@5,17:17='1',<5>,2:5]
[@6,18:18='0',<6>,2:6]
[@7,20:24='LEVEL',<3>,3:0]
[@8,26:26='0',<6>,3:6]
[@9,27:27='1',<5>,3:7]
[@10,29:33='LEVEL',<3>,4:0]
[@11,35:35='0',<6>,4:6]
[@12,36:36='5',<5>,4:7]
[@13,38:42='LEVEL',<3>,5:0]
[@14,44:44='1',<5>,5:6]
[@15,45:45='0',<6>,5:7]
[@16,47:51='LEVEL',<3>,6:0]
[@17,53:53='4',<5>,6:6]
[@18,54:54='9',<5>,6:7]
[@19,55:54='<EOF>',<-1>,6:8]
line 5:6 no viable alternative at input '1'
line 6:6 no viable alternative at input '4'
(statement (integerT INTEGER (integer 3 5 0)) (priorityT PRIO (twoDigits 1 0)) (levelT LEVEL (levelNumber 0 1)) (levelT LEVEL (levelNumber 0 5)) (levelT LEVEL (levelNumber 1 0)) (levelT LEVEL (levelNumber 4 9)))
What am I missing here?
OK, answering my own question here. Of course,
DIGIT: OneToFour | FiveToNine ;
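kicks in where it shouldn't, even in this combined form: as a lexer rule listed before OneToFour and FiveToNine, it claims every digit from 1 to 9 (token type 5 in the dump above), so an OneToFour token is never produced. About the only way around this that I can think of would be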
grammar NOVIANum;
statement : (priorityT | integerT | levelT )* ;
priorityT : T_PRIO twoDigits ;
integerT : T_INTEGER integer ;
levelT : T_LEVEL levelNumber ;
levelNumber : ( ZERO (OneToFour | FiveToNine) | ( OneToFour (ZERO | (OneToFour | FiveToNine)) ) ) ;
integer: ZERO* ( (OneToFour | FiveToNine) ( (OneToFour | FiveToNine) | ZERO )* ) ;
twoDigits : (ZERO | (OneToFour | FiveToNine)) ( ZERO | (OneToFour | FiveToNine) ) ;
WS : [ \t\r\n]+ -> skip ;
T_INTEGER : 'INTEGER' ;
T_LEVEL : 'LEVEL' ;
T_PRIO : 'PRIO' ;
// DIGIT: OneToFour | FiveToNine;
ZERO : '0' ;
OneToFour : [1-4] ;
FiveToNine : [5-9] ;
because when I create a parser rule for it like
oneToNine : OneToFour | FiveToNine ;
it gives me this
(integerT INTEGER (integer (oneToNine 3) (oneToNine 5) 0))
which is ugly and harder to handle than just
(integerT INTEGER (integer 3 5 0))
Recommended answer
As a general matter of design, always try to handle distinguishing elements and their objects (T_PRIO -> TwoDigits) at the same level, either the parser or the lexer. Presuming the semantic nature of the Integer and TwoDigits rules is important, promote them to the parser and let the lexer produce only digits. That is, don't over-constrain the lexer.
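To see why overlapping lexer rules cannot be rescued by the parser, it may help to mimic the lexer's decision procedure outside ANTLR. The Python sketch below is only an approximation under stated assumptions: the `tokenize` helper and the regex versions of the rules are hypothetical stand-ins for the question's first grammar (the separate ZERO rule is left out), and the code models just the two tie-breakers, longest match first and then the rule listed first.

```python
import re

def tokenize(text, rules):
    """Mimic two lexer tie-breakers: the longest match wins and,
    on equal length, the rule listed first wins."""
    compiled = [(name, re.compile(pattern)) for name, pattern in rules]
    tokens, pos = [], 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, regex in compiled:
            m = regex.match(text, pos)
            # Strict '>' keeps the earlier rule when lengths are equal.
            if m and m.end() - pos > best_len:
                best_name, best_len = name, m.end() - pos
        if best_name is None:
            raise ValueError(f"no rule matches at offset {pos}")
        if best_name != "WS":  # WS is skipped, as in the grammar
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

# Regex approximations of the question's lexer rules (assumed, not generated by ANTLR).
COMMON = [("WS", r"[ \t\r\n]+"), ("T_PRIO", r"PRIO"), ("T_INTEGER", r"INTEGER")]
INTEGER = ("Integer", r"[1-9][0-9]*|0")
TWO_DIGITS = ("TwoDigits", r"[0-9][0-9]")

integer_first = COMMON + [INTEGER, TWO_DIGITS]
two_digits_first = COMMON + [TWO_DIGITS, INTEGER]

source = "INTEGER 10\nPRIO 10"
print(tokenize(source, integer_first))
print(tokenize(source, two_digits_first))
```

In either ordering, both occurrences of `10` receive the same token type, because the choice is made before the parser gets a say. One of the two statements therefore always sees the wrong token, which matches the two error messages in the question.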
In the parser, you can let the integer rule functionally hide the twoDigits rule except in the evaluation of the priorityStatement rule:
priorityStatement : T_PRIO twoDigits ;
integerStatement : T_INTEGER integer ;
integer: ZERO | ( DIGIT ( DIGIT | ZERO )* ) ;
// ZERO is allowed in either position so that inputs such as "10" match
twoDigits : ( DIGIT | ZERO ) ( DIGIT | ZERO ) ;
T_PRIO : 'PRIO' ;
T_INTEGER : 'INTEGER' ;
DIGIT : [1-9] ;
ZERO : '0' ;
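The division of labour in that grammar can be sketched in plain Python: the lexer step only classifies single characters as DIGIT or ZERO, and the keyword decides which parser-level check applies. All function names here are hypothetical. The sketch also assumes that twoDigits should accept ZERO in both positions, as the asker's follow-up grammar does, so that `PRIO 10` is valid.

```python
def lex_digits(text):
    """Lexer level: one token per character, ZERO for '0', DIGIT for '1'-'9'."""
    kinds = []
    for ch in text:
        if ch == "0":
            kinds.append("ZERO")
        elif ch in "123456789":
            kinds.append("DIGIT")
        else:
            raise ValueError(f"unexpected character {ch!r}")
    return kinds

def is_integer(kinds):
    # integer : ZERO | ( DIGIT ( DIGIT | ZERO )* ) ;
    return kinds == ["ZERO"] or (len(kinds) >= 1 and kinds[0] == "DIGIT")

def is_two_digits(kinds):
    # Assumed variant: twoDigits : ( DIGIT | ZERO ) ( DIGIT | ZERO ) ;
    return len(kinds) == 2

def check_statement(line):
    """Parser level: the keyword selects which rule the digits must satisfy."""
    keyword, number = line.split()
    kinds = lex_digits(number)
    if keyword == "PRIO":
        return is_two_digits(kinds)
    if keyword == "INTEGER":
        return is_integer(kinds)
    raise ValueError(f"unknown keyword {keyword!r}")

for line in ["INTEGER 10", "PRIO 10", "PRIO 350", "INTEGER 007"]:
    print(line, "->", check_statement(line))
```

Because the same DIGIT and ZERO tokens feed both checks, the overlap between the two number formats no longer matters at the lexer level. The context that tells them apart is only consulted where it is available, in the parser.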