antlr4 can't extract literal into token


Problem description

I have the following grammar and am trying to start out slowly, working up to more complex arguments.

grammar Command;

commands : command+ EOF; 
command : NAME args NL;
args : arg | ;

arg : DASH LOWER | LOWER;
//arg : DASH 'a' | 'x';

NAME : [_a-zA-Z0-9]+;
NL : '\n';
WS : [ \t\r]+ -> skip ; // spaces, tabs, carriage returns
DASH : '-';
LOWER: [a-z];//'a' .. 'z';

I was hoping (for now) to parse files like this:

cmd1
cmd3 -a

If I run that input through grun, I get an error:

$ java org.antlr.v4.gui.TestRig Command commands -tree
...
`line 3:6 mismatched input 'a' expecting LOWER`

It seems like LOWER should match 'a'. If I change the arg definition to the commented-out line, it works fine and I get '-a' as an arg. What's the difference between using LOWER and using 'a' explicitly?

Recommended answer

As soon as you get a "mismatched input" error, add -tokens to grun to display the tokens. It helps to find the discrepancy between what you THINK the lexer will do and what it actually DOES. With your grammar:

$ alias grun='java org.antlr.v4.gui.TestRig'
$ grun Command commands -tokens -diagnostics t.text
[@0,0:3='cmd1',<NAME>,1:0]
[@1,4:4='\n',<'
'>,1:4]
[@2,5:8='cmd3',<NAME>,2:0]
[@3,10:10='-',<'-'>,2:5]
[@4,11:11='a',<NAME>,2:6]
[@5,12:12='\n',<'
'>,2:7]
[@6,13:12='<EOF>',<EOF>,3:0]
line 2:6 mismatched input 'a' expecting LOWER

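You can immediately see that the letter a is a NAME and not the expected LOWER. Both rules match the single character a with the same length, and on a tie the lexer picks the rule that is defined first, which is NAME in your grammar. A literal such as 'a' used directly in a parser rule becomes an implicit token that takes precedence over the explicit lexer rules, which is why the commented-out version works.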
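
The tie can be reproduced outside ANTLR. What follows is a minimal Python sketch, not ANTLR's actual implementation, assuming the usual lexer behaviour: the longest match wins and, when two rules match the same length, the rule listed first wins. The tokenize helper and the two rule tables are hypothetical names made up for this illustration; the patterns are copied from the grammars in this thread.

```python
import re

def tokenize(text, rules):
    """Toy lexer: longest match wins; on a tie, the rule listed first wins."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            # Strictly greater: a later rule with an equal-length match
            # never replaces an earlier one.
            if m and m.end() - pos > best_len:
                best_name, best_len = name, m.end() - pos
        if best_name is None:
            raise ValueError(f"no rule matches at offset {pos}")
        if best_name != "WS":  # mimic '-> skip'
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

# Rule order of the grammar in the question: NAME comes before LOWER.
ORIGINAL_ORDER = [
    ("NAME", r"[_a-zA-Z0-9]+"),
    ("NL", r"\n"),
    ("WS", r"[ \t\r]+"),
    ("DASH", r"-"),
    ("LOWER", r"[a-z]"),
]

# Assumed alternative order: LOWER comes before NAME.
LOWER_FIRST = [
    ("LOWER", r"[a-z]"),
    ("NAME", r"[_a-zA-Z0-9]+"),
    ("DASH", r"-"),
    ("NL", r"\n"),
    ("WS", r"[ \t\r]+"),
]

print(tokenize("cmd3 -a\n", ORIGINAL_ORDER))  # third token is ('NAME', 'a')
print(tokenize("cmd3 -a\n", LOWER_FIRST))     # third token is ('LOWER', 'a')
```

Note that cmd3 stays a NAME in both orders because the longer match beats the rule order. Under the same assumptions, a one-letter lowercase command such as x would come out as LOWER when LOWER is listed first, so a rule that expects NAME at that position would reject it.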

Also watch out for rules with an empty alternative:

args
    :   arg
    |
    ;

Such a rule may lead to problems in some circumstances. I prefer to explicitly add the ? suffix, which means zero or one occurrence. I also define LOWER before NAME so that a single lowercase letter is tokenized as LOWER. So my solution would be:

grammar Command;

commands
@init {System.out.println("Question last update 1829");}
    :   command+ EOF
    ; 

command
    :   NAME args? NL
    ;

args
    :   arg
    ;

arg : DASH? LOWER ;

LOWER : [a-z] ;
NAME  : [_a-zA-Z0-9]+;
DASH  : '-' ;
NL    : '\n' ;
WS    : [ \t\r]+ -> skip ;

Execution:

$ grun Command commands -tokens -diagnostics t.text
[@0,0:3='cmd1',<NAME>,1:0]
[@1,4:4='\n',<'
'>,1:4]
[@2,5:8='cmd3',<NAME>,2:0]
[@3,10:10='-',<'-'>,2:5]
[@4,11:11='a',<LOWER>,2:6]
[@5,12:12='\n',<'
'>,2:7]
[@6,13:12='<EOF>',<EOF>,3:0]
Question last update 1829
