Antlr4: Mismatched input
Question
Here's a simple grammar test I thought would be easy to parse, but I get "mismatched input" right off the bat, and I can't figure out what Antlr is looking for.
Input:
# include "something" program TEST1 { BLAH BLAH }
My grammar:
grammar ProgHeader;
program: header* prog EOF ;
header: '#' ( include | define ) ;
include: 'include' string ;
define: 'define' string string? ;
string: '"' QTEXT '"' ;
prog: 'program' QTEXT '{' BLOCK '}' ;
QTEXT: ~[\r\n\"]+ ;
BLOCK: ~[}]+ ; // don't care, example block
WS: [ \t\r\n] -> skip ;
Error output:
line 1:0 mismatched input '# include "something" program TEST1 { BLAH BLAH '
expecting {'program', '#'}
This really confuses me, because it says it's looking for a '#' and there's one right at the start of the input. I dumped the parse tree too. It appears to be stuck right at the top, at the 'program' rule:
(program # include "something" program TEST1 { BLAH BLAH } )
Huh?
Here's the full program driving this test case, in case it matters (I don't think it should; the information above ought to be enough, but here it is):
package antlrtests;

import antlrtests.grammars.*;
import org.antlr.v4.runtime.*;
import org.antlr.v4.runtime.tree.*;

/**
 *
 * @author Brenden Towey
 */
public class ProgHeaderTest {
    private String[] testVectors = {
        "# include \"something\" program TEST1 { BLAH BLAH } ",
    };

    public void runTests() {
        for( String test : testVectors )
            simpleTest( test );
    }

    private void simpleTest( String test ) {
        ANTLRInputStream ains = new ANTLRInputStream( test );
        ProgHeaderLexer wpl = new ProgHeaderLexer( ains );
        CommonTokenStream tokens = new CommonTokenStream( wpl );
        ProgHeaderParser wikiParser = new ProgHeaderParser( tokens );
        ParseTree parseTree = wikiParser.program();
        System.out.println( "'" + test + "': " + parseTree.toStringTree(
            wikiParser ) );
    }
}
And the full output:
run:
line 1:0 mismatched input '# include "something" program TEST1 { BLAH BLAH ' expecting {'program', '#'}
'# include "something" program TEST1 { BLAH BLAH } ': (program # include "something" program TEST1 { BLAH BLAH } )
BUILD SUCCESSFUL (total time: 0 seconds)
Recommended answer
The lexer runs before the parser and always picks the longest token that matches at the current position. At the very beginning of the input, QTEXT already matches the text '# include ' (everything up to but not including the first " character), which is longer than the single-character token '#'. BLOCK matches even more, everything up to but not including the first }, and that is exactly the token text shown in the error message. The parser, however, accepts only 'program' or '#' at that point, as reported. So it is better to avoid token definitions that match almost anything.
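To make the longest-match behaviour concrete, here is a minimal sketch in plain Java that imitates how a lexer chooses the first token. It does not use the ANTLR runtime. The class LongestMatchDemo, its methods, and the regular expressions are assumptions written for illustration. The STRING and ID rules in narrowerRules() are a hypothetical alternative, not something taken from the question. Ties are resolved in favour of the rule listed first, which is how ANTLR is generally understood to behave.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LongestMatchDemo {

    // Returns {ruleName, matchedText} for the first token of the input.
    // The longest match wins; on equal length the rule listed first wins.
    static String[] firstToken(Map<String, String> rules, String input) {
        String bestRule = null;
        String bestText = "";
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            Matcher m = Pattern.compile(rule.getValue()).matcher(input);
            // lookingAt() anchors the match at the start of the input
            if (m.lookingAt() && m.end() > bestText.length()) {
                bestRule = rule.getKey();
                bestText = m.group();
            }
        }
        return new String[] { bestRule, bestText };
    }

    // Regex approximations of some token rules from the question's grammar
    static Map<String, String> originalRules() {
        Map<String, String> rules = new LinkedHashMap<>();
        rules.put("'#'", "#");
        rules.put("'program'", "program");
        rules.put("QTEXT", "[^\\r\\n\"]+");   // QTEXT: ~[\r\n\"]+
        rules.put("BLOCK", "[^}]+");          // BLOCK: ~[}]+
        rules.put("WS", "[ \\t\\r\\n]");
        return rules;
    }

    // Hypothetical narrower rules (an assumption, not from the question)
    static Map<String, String> narrowerRules() {
        Map<String, String> rules = new LinkedHashMap<>();
        rules.put("'#'", "#");
        rules.put("'program'", "program");
        rules.put("STRING", "\"[^\"\\r\\n]*\""); // a whole quoted string as one token
        rules.put("ID", "[A-Za-z_][A-Za-z0-9_]*");
        rules.put("WS", "[ \\t\\r\\n]");
        return rules;
    }

    public static void main(String[] args) {
        String input = "# include \"something\" program TEST1 { BLAH BLAH } ";
        String[] greedy = firstToken(originalRules(), input);
        System.out.println(greedy[0] + " -> '" + greedy[1] + "'");
        String[] narrow = firstToken(narrowerRules(), input);
        System.out.println(narrow[0] + " -> '" + narrow[1] + "'");
    }
}
```

With the rules modelled on the question, the first token comes out as BLOCK and swallows everything up to the closing brace, which is the text quoted in the error message. With the narrower rules, the first token is '#', which is what the parser expects.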