Antlr Extraneous Input

Question

I have a grammar file BoardFile.g4 that contains (relevant parts only):

grammar Board;

//Tokens
GADGET : 'squareBumper' | 'circleBumper' | 'triangleBumper' | 'leftFlipper' | 'rightFlipper' | 'absorber' | 'portal' ;
NAME : [A-Za-z_][A-Za-z_0-9]* ;
INT : [0-9]+ ;
FLOAT : '-'?[0-9]+('.'[0-9]+)? ;
COMMENT : '#' ~( '\r' | '\n' )*;
WHITESPACE : [ \t\r\n]+ -> skip ;
KEY : [a-z] | [0-9] | 'shift' | 'ctrl' | 'alt' | 'meta' | 'space' | 'left' | 'right' | 'up' | 'down' | 'minus' | 'equals' | 'backspace' | 'openbracket' | 'closebracket' | 'backslash' | 'semicolon' | 'quote' | 'enter' | 'comma' | 'period' | 'slash' ;
KEYPRESS : 'keyup' | 'keydown' ;

//Rules
file : define+ EOF ;
define : board | ball | gadget | fire | COMMENT | key ;
board : 'board' 'name' '=' name ('gravity' '=' gravity)? ('friction1' '=' friction1)? ('friction2' '=' friction2)? ;
ball : 'ball' 'name' '=' name 'x' '=' xfloat 'y' '=' yfloat 'xVelocity' '=' xvel 'yVelocity' '=' yvel ;
gadget : gadgettype 'name' '=' name 'x' '=' xint 'y' '=' yint ('width' '=' width 'height' '=' height)? ('orientation' '=' orientation)? ('otherBoard' '=' name 'otherPortal' '=' name)? ;
fire : 'fire' 'trigger' '=' trigger 'action' '=' action ;
key : keytype 'key' '=' KEY 'action' '=' name ;

name : NAME ;
gadgettype : GADGET ;
keytype : KEYPRESS ;
gravity : FLOAT ;
friction1 : FLOAT ;
friction2 : FLOAT ;
trigger : NAME ;
action : NAME ;
yfloat : FLOAT ;
xfloat : FLOAT ;
yint : INT ;
xint : INT ;
xvel : FLOAT ;
yvel : FLOAT ;
orientation : INT ;
width : INT ;
height : INT ;

This generates the lexer and parser fine. However, when I run the parser against the file shown below, it gives the following error:

line 12:0 extraneous input 'keyup' expecting {<EOF>, KEYPRESS}

The file to parse:

board name=keysBoard gravity=5.0 friction1=0.0 friction2=0.0

# define a ball
ball name=Ball x=0.5 y=0.5 xVelocity=2.5 yVelocity=2.5

# add some flippers
leftFlipper name=FlipL1 x=16 y=2 orientation=0
leftFlipper name=FlipL2 x=16 y=9 orientation=0

# add keys. lots of keys.
keyup key=space action=apple
keydown key=a action=ball
keyup key=backslash action=cat
keydown key=period action=dog

I went through other questions about this error on SO, but none of them helped. I cannot figure out what is going wrong. Why am I getting this error?

Recommended answer

The string "keyup" is being tokenized as a NAME token: that is the problem.

You must realize that the lexer operates independently of the parser. Even when the parser is trying to match a KEYPRESS token, the lexer does not "listen" to it; it simply constructs tokens according to these rules:

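  1. match the rule that consumes the most characters
  2. if several rules match the same number of characters, choose the rule that is defined first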
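
The two rules above can be sketched in a few lines of Python. This is only a simulation using the standard `re` module, not ANTLR itself; the helper name `winning_rule` is made up for this illustration, and only two of the question's lexer rules are included.

```python
import re

def winning_rule(rules, text):
    """Return (rule_name, matched_text) for the token at the start of text.

    rules is a list of (name, regex) pairs in definition order.
    """
    best = None
    for name, pattern in rules:
        m = re.match(pattern, text)
        # Only a strictly longer match replaces the current best, so on a
        # tie the rule that was defined first keeps the token.
        if m and m.group() and (best is None or len(m.group()) > len(best[1])):
            best = (name, m.group())
    return best

# Two lexer rules from the question, in the question's order.
rules = [
    ("NAME", r"[A-Za-z_][A-Za-z_0-9]*"),
    ("KEYPRESS", r"keyup|keydown"),
]

print(winning_rule(rules, "keyup"))                  # ('NAME', 'keyup')
print(winning_rule(list(reversed(rules)), "keyup"))  # ('KEYPRESS', 'keyup')
```

Both rules match all five characters of "keyup", so the result depends only on which rule comes first.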

Taking these rules into account, and the order of your rules:

NAME : [A-Za-z_][A-Za-z_0-9]* ;

INT : [0-9]+ ;

KEY : [a-z] | [0-9] | 'shift' | 'ctrl' | 'alt' | 'meta' | 'space' | 'left' | 'right' | 'up' | 'down' | 'minus' | 'equals' | 'backspace' | 'openbracket' | 'closebracket' | 'backslash' | 'semicolon' | 'quote' | 'enter' | 'comma' | 'period' | 'slash' ;

KEYPRESS : 'keyup' | 'keydown' ;

a NAME token will be created instead of most of the KEY alternatives and all of the KEYPRESS alternatives, because NAME matches at least as many characters and is defined first.

And since INT matches one or more digits and is defined before KEY, which also has a single-digit alternative, it is clear that the lexer will never produce a KEY or KEYPRESS token.
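
To check that claim, the following sketch classifies every KEY and KEYPRESS alternative under the question's rule order. It is a plain-Python approximation of the lexer, not ANTLR; `token_type`, `KEY_WORDS` and `ORIGINAL_ORDER` are names invented for this example, and the GADGET, FLOAT, COMMENT and WHITESPACE rules are left out on the assumption that they do not affect these inputs.

```python
import re

KEY_WORDS = ["shift", "ctrl", "alt", "meta", "space", "left", "right", "up",
             "down", "minus", "equals", "backspace", "openbracket",
             "closebracket", "backslash", "semicolon", "quote", "enter",
             "comma", "period", "slash"]

# Lexer rules in the order used in the question.
ORIGINAL_ORDER = [
    ("NAME", r"[A-Za-z_][A-Za-z_0-9]*"),
    ("INT", r"[0-9]+"),
    # Python alternation is ordered, so the one-character classes go last
    # to let the longer literals match first.
    ("KEY", "|".join(KEY_WORDS + ["[a-z]", "[0-9]"])),
    ("KEYPRESS", r"keyup|keydown"),
]

def token_type(rules, text):
    """Longest match wins; on a tie, the rule defined first wins."""
    best_name, best_len = None, 0
    for name, pattern in rules:
        m = re.match(pattern, text)
        if m and len(m.group()) > best_len:
            best_name, best_len = name, len(m.group())
    return best_name

samples = KEY_WORDS + ["a", "7", "keyup", "keydown"]
produced = {token_type(ORIGINAL_ORDER, s) for s in samples}
print(sorted(produced))  # ['INT', 'NAME']
```

None of the sample inputs comes out as KEY or KEYPRESS: letters go to NAME and the digit goes to INT.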

My guess is that if you move the NAME and INT rules below the KEY and KEYPRESS rules, most of the tokens will be constructed as you expect.

A possible solution looks like this:

KEY : [a-z] | 'shift' | 'ctrl' | 'alt' | 'meta' | 'space' | 'left' | 'right' | 'up' | 'down' | 'minus' | 'equals' | 'backspace' | 'openbracket' | 'closebracket' | 'backslash' | 'semicolon' | 'quote' | 'enter' | 'comma' | 'period' | 'slash' ;

KEYPRESS : 'keyup' | 'keydown' ;

NAME : [A-Za-z_][A-Za-z_0-9]* ;

SINGLE_DIGIT : [0-9] ;

INT : [0-9]+ ;

That is, I removed the [0-9] alternative from KEY and introduced a SINGLE_DIGIT rule (which is placed before the INT rule!).
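
As a rough check of the reordered rules, the sketch below runs the same kind of longest-match, first-rule-wins simulation over a few words from the input file. It is an approximation in plain Python rather than ANTLR; `FIXED_ORDER` and `token_type` are hypothetical names, only a subset of the KEY literals is listed for brevity, and FLOAT is omitted because the answer does not say where it should go.

```python
import re

# A subset of the KEY literals, enough for the sample words below.
KEY_WORDS = ["shift", "ctrl", "alt", "space", "backslash", "period"]

# Lexer rules in the order proposed in the answer.
FIXED_ORDER = [
    # Python alternation is ordered, so [a-z] goes after the literals.
    ("KEY", "|".join(KEY_WORDS + ["[a-z]"])),
    ("KEYPRESS", r"keyup|keydown"),
    ("NAME", r"[A-Za-z_][A-Za-z_0-9]*"),
    ("SINGLE_DIGIT", r"[0-9]"),
    ("INT", r"[0-9]+"),
]

def token_type(rules, text):
    """Longest match wins; on a tie, the rule defined first wins."""
    best_name, best_len = None, 0
    for name, pattern in rules:
        m = re.match(pattern, text)
        if m and len(m.group()) > best_len:
            best_name, best_len = name, len(m.group())
    return best_name

for word in ["keyup", "space", "a", "apple", "7", "16"]:
    print(word, token_type(FIXED_ORDER, word))
# keyup KEYPRESS
# space KEY
# a KEY
# apple NAME
# 7 SINGLE_DIGIT
# 16 INT
```

A lone digit now becomes SINGLE_DIGIT and a lone lowercase letter becomes KEY, which is why the parser needs rules that accept those tokens wherever an integer or a key is expected.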

Now create some extra parser rules:

integer : INT | SINGLE_DIGIT ;

key : KEY | SINGLE_DIGIT ;

and change all occurrences of INT inside parser rules to integer (don't call your rule int: it is a reserved word) and change all occurrences of KEY to key. Note that the grammar in the question already has a parser rule named key, so one of the two rules needs a different name.

You might also want to do something similar for NAME and the [a-z] alternative in KEY, because a single lowercase character will now never be tokenized as a NAME, always as a KEY.
