ANTLR grammar to recognize DIGIT keys and INTEGERS too

Problem description

I'm trying to create an ANTLR grammar to parse sequences of keys that optionally have a repeat count. For example, (a b c r5) means "repeat keys a, b, and c five times."

I have the grammar working for KEYS : ('a'..'z'|'A'..'Z').

But when I try to add digit keys, KEYS : ('a'..'z'|'A'..'Z'|'0'..'9'), with an input expression like (a 5 r5), the parse fails on the middle 5 because it can't tell if the 5 is an INTEGER or a KEY. (Or so I think; error messages such as "NoViableAltException" are difficult to interpret.)

I have tried these grammatical forms, which work ('r' means "repeatcount"):

repeat : '(' LETTERKEYS INTEGER ')' - works for a-zA-Z
repeat : '(' LETTERKEYS 'r' INTEGER ')'; - works for a-zA-Z

But I fail with

repeat : '(' LETTERSandDIGITKEYS INTEGER ')' - fails on '(a 5 r5)'
repeat : '(' LETTERSandDIGITKEYS 'r' INTEGER ')'; - fails on '(a 5 r5)'

Maybe the grammar can't do the recognition; maybe I need to recognize all the 5s in the same way (as KEYS or DIGITS or INTEGERS) and, in the parse tree visitor, interpret the middle DIGIT instances as keys and the last set of DIGITS as an INTEGER count?

Is it possible to define a grammar that allows me to repeat digit keys as well as letter keys so that expressions like (a 5 123 r5) will be recognized correctly? (That is, "repeat keys a,5,1,2,3 five times.") I'm not tied to that specific syntax, although it would be nice to use something similar.

Thank you.

Solution

"the parse fails on the middle 5 because it can't tell if the 5 is an INTEGER or a KEY."

If you have defined the following rules:

INTEGER : [0-9]+;
KEY     : [a-zA-Z0-9];

then a single digit, like 5 in your example, will always become an INTEGER token. Even if the parser is trying to match a KEY token, the 5 will become an INTEGER, because the lexer produces tokens without any knowledge of what the parser expects at that point. There is nothing you can do about that: this is the way ANTLR's lexer works. The lexer works in the following way:

  1. try to consume as many characters as possible (the longest match wins)
  2. if 2 or more rules match the same characters (like INTEGER and KEY in case of 5), let the rule defined first "win"
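
The two rules above can be illustrated with a small Python simulation. This is only a sketch of the matching policy, not ANTLR itself: the `tokenize` function and the `RULES` list are hypothetical names introduced for illustration, and whitespace is simply skipped, as a typical WS rule would do.

```python
import re

# Hypothetical stand-in for the lexer rules, listed in definition order.
RULES = [
    ("INTEGER", re.compile(r"[0-9]+")),
    ("KEY", re.compile(r"[a-zA-Z0-9]")),
]

def tokenize(text, rules):
    """Longest match wins; on a tie, the rule defined first wins."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():  # skip whitespace between tokens
            pos += 1
            continue
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = pattern.match(text, pos)
            # Strictly greater: an earlier rule keeps the win on equal length.
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches {text[pos]!r} at position {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

# The lone 5 is claimed by INTEGER, never by KEY.
print(tokenize("a 5 123", RULES))
```

Both INTEGER and KEY match the single character 5 with the same length, so the tie goes to INTEGER because it is defined first. Swapping the order of the two rules would turn a lone 5 into a KEY, but 123 would still be an INTEGER because the longer match wins.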

If you want a 5 to be an INTEGER, but sometimes a KEY, do something like this instead:

key     : KEY | SINGLE_DIGIT | R;
integer : INTEGER | SINGLE_DIGIT;
repeat  : R integer;

SINGLE_DIGIT : [0-9];
INTEGER      : [0-9]+;
R            : 'r';
KEY          : [a-zA-Z];

and in your parser rules, you use key and integer instead of KEY and INTEGER.
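
To see how this token ordering behaves on (a 5 r5), here is a rough Python sketch that mimics the lexer rules above and then applies the key and integer alternatives by hand. It is a simulation under stated assumptions, not generated ANTLR code: `tokenize` and `parse_sequence` are hypothetical helpers, the LPAREN and RPAREN rules are added so the sample input can be tokenized, and the overall shape '(' key+ (R integer)? ')' is assumed from the question rather than taken from the answer.

```python
import re

# Token rules in the order given above. LPAREN/RPAREN are an assumption,
# added so that the parenthesized sample input can be tokenized.
RULES = [
    ("SINGLE_DIGIT", re.compile(r"[0-9]")),
    ("INTEGER", re.compile(r"[0-9]+")),
    ("R", re.compile(r"r")),
    ("KEY", re.compile(r"[a-zA-Z]")),
    ("LPAREN", re.compile(r"\(")),
    ("RPAREN", re.compile(r"\)")),
]

def tokenize(text):
    """Longest match wins; on a tie, the rule defined first wins."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        best_name, best_len = None, 0
        for name, pattern in RULES:
            m = pattern.match(text, pos)
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches {text[pos]!r} at position {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

def parse_sequence(text):
    """Parse '(' key+ (R integer)? ')' and return (keys, repeat_count)."""
    tokens = tokenize(text)
    if len(tokens) < 3 or tokens[0][0] != "LPAREN" or tokens[-1][0] != "RPAREN":
        raise ValueError("expected '(' key+ [r count] ')'")
    body = tokens[1:-1]
    count = 1
    # repeat : R integer;  integer : INTEGER | SINGLE_DIGIT
    # Assumption: a trailing "r <number>" is always read as the repeat count.
    if len(body) >= 3 and body[-2][0] == "R" and body[-1][0] in ("INTEGER", "SINGLE_DIGIT"):
        count = int(body[-1][1])
        body = body[:-2]
    keys = []
    for name, value in body:
        # key : KEY | SINGLE_DIGIT | R
        if name not in ("KEY", "SINGLE_DIGIT", "R"):
            raise ValueError(f"{value!r} ({name}) is not a valid key")
        keys.append(value)
    return keys, count

print(parse_sequence("(a 5 r5)"))
```

In this sketch the middle 5 and the count 5 are both SINGLE_DIGIT tokens, and it is the parser-level alternatives that decide which role each one plays. Note that a multi-digit run such as 123 still lexes as a single INTEGER token, so with the rules exactly as written it is not accepted as a key; treating it as the keys 1, 2, 3 would need a further change to the grammar.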
