ANTLR grammar to recognize DIGIT keys and INTEGERS too

Problem description

I'm trying to create an ANTLR grammar to parse sequences of keys that optionally have a repeat count. For example, (a b c r5) means "repeat keys a, b, and c five times."

I have the grammar working for KEYS : ('a'..'z'|'A'..'Z').

But when I add digit keys, KEYS : ('a'..'z'|'A'..'Z'|'0'..'9'), an input expression like (a 5 r5) fails to parse at the middle 5, because the parser can't tell whether the 5 is an INTEGER or a KEY. (Or so I think; error messages such as "NoViableAltException" are difficult to interpret.)

I have tried these rule forms, which work ('r' introduces the repeat count):

repeat : '(' LETTERKEYS INTEGER ')'; - works for a-zA-Z
repeat : '(' LETTERKEYS 'r' INTEGER ')'; - works for a-zA-Z

But these forms fail:

repeat : '(' LETTERSandDIGITKEYS INTEGER ')'; - fails on '(a 5 r5)'
repeat : '(' LETTERSandDIGITKEYS 'r' INTEGER ')'; - fails on '(a 5 r5)'

Maybe the grammar can't do the recognition. Maybe I need to recognize every 5 the same way (as KEYS, DIGITS, or INTEGERS) and have the parse tree visitor interpret the middle DIGIT instances as keys and the last run of DIGITS as an INTEGER count?

Is it possible to define a grammar that allows me to repeat digit keys as well as letter keys so that expressions like (a 5 123 r5) will be recognized correctly? (That is, "repeat keys a,5,1,2,3 five times.") I'm not tied to that specific syntax, although it would be nice to use something similar.

Thank you.

Solution

You wrote: "the parse fails on the middle 5 because it can't tell if the 5 is an INTEGER or a KEY."

If you have defined the following rules:

INTEGER : [0-9]+;
KEY     : [a-zA-Z0-9];

then a single digit, like the 5 in your example, will always become an INTEGER token. Even if the parser is trying to match a KEY token, the 5 will become an INTEGER. There is nothing you can do about that: ANTLR's lexer tokenizes the input without consulting the parser. It works in the following way:

  1. try to consume as many characters as possible (the longest match wins)
  2. if 2 or more rules match the same characters (like INTEGER and KEY in case of 5), let the rule defined first "win"
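The two rules above can be illustrated with a small Python simulation. This is a toy lexer, not ANTLR itself: the `tokenize` function, the `ORIGINAL_RULES` list, and the `LPAREN`/`RPAREN` token names are assumptions made for this sketch, and whitespace is simply skipped.

```python
import re

def tokenize(text, rules):
    """Toy lexer: the longest match wins; ties go to the rule listed first."""
    compiled = [(name, re.compile(pattern)) for name, pattern in rules]
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        best_name, best_len = None, 0
        for name, regex in compiled:
            m = regex.match(text, pos)
            # Strictly greater: an equal-length match never displaces an earlier rule.
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches {text[pos]!r} at position {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

# Same order as in the answer: INTEGER is defined before KEY.
ORIGINAL_RULES = [
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("INTEGER", r"[0-9]+"),
    ("KEY", r"[a-zA-Z0-9]"),
]

print(tokenize("(a 5 r5)", ORIGINAL_RULES))
# [('LPAREN', '('), ('KEY', 'a'), ('INTEGER', '5'), ('KEY', 'r'), ('INTEGER', '5'), ('RPAREN', ')')]
```

Both INTEGER and KEY match the lone 5 with the same length, so the rule listed first (INTEGER) is chosen, and a parser expecting a KEY at that position never sees one.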

If you want a 5 to be an INTEGER, but sometimes a KEY, do something like this instead:

key     : KEY | SINGLE_DIGIT | R;
integer : INTEGER | SINGLE_DIGIT;
repeat  : R integer;

SINGLE_DIGIT : [0-9];
INTEGER      : [0-9]+;
R            : 'r';
KEY          : [a-zA-Z];

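Then, in your parser rules, use key and integer instead of KEY and INTEGER. Note that SINGLE_DIGIT must be defined before INTEGER so that a lone digit becomes a SINGLE_DIGIT token; a longer run of digits still becomes an INTEGER because the longest match wins.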
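To see how the revised token order behaves, here is a rough Python sketch that mimics it. It is a simulation under stated assumptions, not ANTLR output: the `sequence` rule shape `'(' key+ (R integer)? ')'`, the `LPAREN`/`RPAREN` token names, and the helper functions `tokenize` and `parse_sequence` are all hypothetical and introduced only for illustration.

```python
import re

# Lexer rules in the same order as the revised grammar above.
RULES = [
    ("LPAREN", r"\("), ("RPAREN", r"\)"),
    ("SINGLE_DIGIT", r"[0-9]"),
    ("INTEGER", r"[0-9]+"),
    ("R", r"r"),
    ("KEY", r"[a-zA-Z]"),
]
COMPILED = [(name, re.compile(pattern)) for name, pattern in RULES]

def tokenize(text):
    """Toy lexer: longest match wins; ties go to the rule listed first."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        best_name, best_len = None, 0
        for name, regex in COMPILED:
            m = regex.match(text, pos)
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches {text[pos]!r} at position {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

KEY_TOKENS = {"KEY", "SINGLE_DIGIT", "R"}     # key     : KEY | SINGLE_DIGIT | R;
INTEGER_TOKENS = {"INTEGER", "SINGLE_DIGIT"}  # integer : INTEGER | SINGLE_DIGIT;

def parse_sequence(text):
    """Hypothetical rule: sequence : '(' key+ (R integer)? ')' ;"""
    tokens = tokenize(text)
    if len(tokens) < 3 or tokens[0][0] != "LPAREN" or tokens[-1][0] != "RPAREN":
        raise ValueError("expected '(' key+ ')'")
    body = tokens[1:-1]
    count = 1
    # A trailing "R integer" pair (after at least one key) is read as the repeat count.
    if len(body) >= 3 and body[-2][0] == "R" and body[-1][0] in INTEGER_TOKENS:
        count = int(body[-1][1])
        body = body[:-2]
    for kind, value in body:
        if kind not in KEY_TOKENS:
            raise ValueError(f"{value!r} ({kind}) is not a key")
    return [value for _, value in body], count

print(parse_sequence("(a 5 r5)"))     # (['a', '5'], 5)
print(parse_sequence("(a b c r12)"))  # (['a', 'b', 'c'], 12)
```

In this sketch the middle 5 lexes as SINGLE_DIGIT and is accepted as a key, while the trailing r5 supplies the count. An input such as (a 5 123 r5) is still rejected, because 123 lexes as a single INTEGER and key does not accept INTEGER; splitting it into the keys 1, 2, 3 would need an extra step that the answer does not cover.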
