ANTLR: beginner's "mismatched input expecting ID"


Question

As a beginner learning ANTLR 4 from The Definitive ANTLR 4 Reference, I tried to run my modified version of an exercise from Chapter 7:

/**
 * to parse properties file
 * this example demonstrates using embedded actions in code
 */
grammar PropFile;

@header  {
    import java.util.Properties;
}
@members {
    Properties props = new Properties();
}
file
    : 
    {
        System.out.println("Loading file...");
    }
        prop+
    {
        System.out.println("finished:\n"+props);
    }
    ;

prop
    : ID '=' STRING NEWLINE 
    {
        props.setProperty($ID.getText(),$STRING.getText());//add one property
    }
    ;

ID  : [a-zA-Z]+ ;
STRING  :(~[\r\n])+; //if use  STRING : '"' .*? '"'  everything is fine
NEWLINE :   '\r'?'\n' ;

Since Java properties are just key-value pairs, I use STRING to match everything except NEWLINE (I don't want it to support only strings in double quotes). When I run it on the following input, I get:

D:\Antlr\Ex\PropFile\Prop1>grun PropFile prop -tokens
driver=mysql
^Z
[@0,0:11='driver=mysql',<3>,1:0]
[@1,12:13='\r\n',<4>,1:12]
[@2,14:13='<EOF>',<-1>,2:14]
line 1:0 mismatched input 'driver=mysql' expecting ID

When I use STRING : '"' .*? '"' instead, it works.

I would like to know where I went wrong so that I can avoid similar mistakes in the future.

Any suggestions would be appreciated, thank you!

Recommended answer

Both ID and STRING can match the input text starting with "driver", and the lexer always chooses the longest possible match. ID matches only "driver", while STRING matches the whole of "driver=mysql", so STRING wins even though the ID rule comes first. Rule order only breaks ties between matches of equal length.
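To make the longest-match rule concrete, here is a minimal sketch that imitates the lexer's decision with java.util.regex. It does not use the ANTLR runtime, and the class name LongestMatchDemo and the method winner are made up for illustration; the three patterns are plain-regex equivalents of the ID, STRING and NEWLINE rules from the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: it only imitates the lexer's choice,
// it is not how ANTLR is implemented.
class LongestMatchDemo {
    // Insertion order mirrors the grammar: ID is declared before STRING.
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("ID", Pattern.compile("[a-zA-Z]+"));
        RULES.put("STRING", Pattern.compile("[^\\r\\n]+"));
        RULES.put("NEWLINE", Pattern.compile("\\r?\\n"));
    }

    // Returns the name of the rule that wins at position pos:
    // the longest match wins, and the earlier rule wins only on a tie.
    static String winner(String input, int pos) {
        String best = null;
        int bestLen = 0;
        for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
            Matcher m = rule.getValue().matcher(input);
            m.region(pos, input.length());
            if (m.lookingAt()) {
                int len = m.end() - pos;
                if (len > bestLen) {   // strictly longer, so ties keep the earlier rule
                    best = rule.getKey();
                    bestLen = len;
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(winner("driver=mysql\r\n", 0)); // STRING
        System.out.println(winner("driver\r\n", 0));       // ID
    }
}
```

In the first call STRING covers 12 characters against 6 for ID, which is the situation behind the error message in the question. In the second call both rules cover the same 6 characters, so the rule declared first wins.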

So you have several choices here. The most direct is to remove the ambiguity between ID and STRING by requiring the string to start with the equals sign. Removing the ambiguity is also why your quoted-string alternative works.

file : prop+ EOF ;
prop : ID STRING NEWLINE ;

ID      : [a-zA-Z]+ ;
STRING  : '=' (~[\r\n])+;
NEWLINE : '\r'?'\n' ;

You can then use an action to trim the equals sign from the text of the STRING token.
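The trimming step could look roughly like the sketch below. The class PropValueTrim and its method valueOf are hypothetical helpers, not part of the original answer or of the ANTLR runtime; they assume the STRING token text always begins with '=' as in the revised rule.

```java
import java.util.Properties;

// Hypothetical helper for illustration only.
class PropValueTrim {
    // Drops the leading '=' that the revised STRING rule includes in the token text.
    // Text without a leading '=' is returned unchanged.
    static String valueOf(String stringTokenText) {
        return stringTokenText.startsWith("=")
                ? stringTokenText.substring(1)
                : stringTokenText;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("driver", valueOf("=mysql"));
        System.out.println(props.getProperty("driver")); // mysql
    }
}
```

Inside the prop rule's action, the same idea would be a call along the lines of props.setProperty($ID.text, $STRING.text.substring(1)), assuming the props field from the question's grammar.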

Alternatively, you can use a predicate to disambiguate the rules.

file : prop+ EOF ;
prop : ID '=' STRING NEWLINE ;

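// STRING is listed before ID so that it wins a tie (for example the
// all-letter value "mysql") whenever the predicate holds.
STRING  : { isValue() }? (~[\r\n])+;
ID      : [a-zA-Z]+ ;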
NEWLINE : '\r'?'\n' ;

where the isValue method looks backwards in the character stream to verify that the token being matched follows an equals sign. Something like:

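// In a combined grammar a plain @members block goes into the parser,
// so this lexer helper is declared in @lexer::members instead.
@lexer::members {
// Returns true if the nearest non-whitespace character before the
// start of the current token is '='.
public boolean isValue() {
    int offset = _tokenStartCharIndex;
    for (int idx = offset - 1; idx >= 0; idx--) {
        String s = _input.getText(Interval.of(idx, idx));
        char c = s.charAt(0);
        if (Character.isWhitespace(c)) {
            continue;       // skip whitespace between '=' and the value
        } else if (c == '=') {
            return true;
        } else {
            break;          // any other character: not a value position
        }
    }
    return false;
}
}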
