ANTLR: beginner's mismatched input expecting ID


Problem description

As a beginner learning ANTLR 4 from the book The Definitive ANTLR 4 Reference, I tried to run my modified version of an exercise from Chapter 7:

/**
 * to parse properties file
 * this example demonstrates using embedded actions in code
 */
grammar PropFile;

@header  {
    import java.util.Properties;
}
@members {
    Properties props = new Properties();
}
file
    : 
    {
        System.out.println("Loading file...");
    }
        prop+
    {
        System.out.println("finished:\n"+props);
    }
    ;

prop
    : ID '=' STRING NEWLINE 
    {
        props.setProperty($ID.getText(),$STRING.getText());//add one property
    }
    ;

ID  : [a-zA-Z]+ ;
STRING  :(~[\r\n])+; //if use  STRING : '"' .*? '"'  everything is fine
NEWLINE :   '\r'?'\n' ;

Since Java properties are just key-value pairs, I use STRING to match everything except NEWLINE (I don't want it to support only strings in double quotes). When running the following input, I got:

D:\Antlr\Ex\PropFile\Prop1>grun PropFile prop -tokens
driver=mysql
^Z
[@0,0:11='driver=mysql',<3>,1:0]
[@1,12:13='\r\n',<4>,1:12]
[@2,14:13='<EOF>',<-1>,2:14]
line 1:0 mismatched input 'driver=mysql' expecting ID

When I use STRING : '"' .*? '"' instead, it works.

I would like to know where I went wrong so that I can avoid similar mistakes in the future.

Please give me some suggestions, thank you!

Recommended answer

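Since both ID and STRING can match the input text starting with "driver", the lexer chooses the longest possible match, even though the ID rule comes first. Here STRING matches the whole line driver=mysql, whereas ID matches only driver, so the lexer emits a single STRING token (as the -tokens output shows) and the parser never sees an ID.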
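The longest-match rule can be illustrated outside ANTLR with a small simulation. This is only a sketch: it uses java.util.regex patterns as stand-ins for the lexer rules, and the class name LongestMatchDemo and the method nextToken are made up for illustration. It is not how ANTLR is implemented internally, but it applies the same selection policy: longest match wins, and rule order only breaks ties.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LongestMatchDemo {
    // Stand-ins for the token rules, kept in grammar order
    // (a LinkedHashMap preserves insertion order).
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("'='", Pattern.compile("="));
        RULES.put("ID", Pattern.compile("[a-zA-Z]+"));
        RULES.put("STRING", Pattern.compile("[^\\r\\n]+"));
        RULES.put("NEWLINE", Pattern.compile("\\r?\\n"));
    }

    // Returns "RULE:text" for the rule with the longest match at pos,
    // or null if no rule matches there.
    static String nextToken(String input, int pos) {
        String bestRule = null;
        int bestEnd = pos;
        for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
            Matcher m = rule.getValue().matcher(input);
            m.region(pos, input.length());
            // Strictly greater: on a tie, the earlier rule keeps the win.
            if (m.lookingAt() && m.end() > bestEnd) {
                bestRule = rule.getKey();
                bestEnd = m.end();
            }
        }
        return bestRule == null ? null : bestRule + ":" + input.substring(pos, bestEnd);
    }

    public static void main(String[] args) {
        // STRING covers 12 characters, ID only 6, so STRING wins.
        System.out.println(nextToken("driver=mysql\r\n", 0)); // STRING:driver=mysql
        // Both rules cover 6 characters, so the earlier rule (ID) wins.
        System.out.println(nextToken("driver\r\n", 0));       // ID:driver
    }
}
```

The second call shows why rule order alone did not help in the question: order matters only when two rules match text of the same length.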

So, you have several choices here. The most direct is to remove the ambiguity between ID and STRING (which is why your quoted-string alternative works), in this case by requiring the string to start with the equals sign.

file : prop+ EOF ;
prop : ID STRING NEWLINE ;

ID      : [a-zA-Z]+ ;
STRING  : '=' (~[\r\n])+;
NEWLINE : '\r'?'\n' ;

You can then use an action to trim the equals sign from the text of the string token.
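A minimal sketch of that trimming step, assuming it lives in a small helper class: the names PropValues and stripEquals are hypothetical and not part of ANTLR or the original grammar.

```java
public class PropValues {
    // Hypothetical helper: removes the leading '=' that the STRING rule
    // now includes in the token text. Text without a leading '=' is
    // returned unchanged.
    static String stripEquals(String tokenText) {
        if (tokenText.startsWith("=")) {
            return tokenText.substring(1);
        }
        return tokenText;
    }

    public static void main(String[] args) {
        System.out.println(stripEquals("=mysql")); // mysql
    }
}
```

Assuming the helper is visible to the generated parser, the action in prop could then read props.setProperty($ID.getText(), PropValues.stripEquals($STRING.getText())); calling $STRING.getText().substring(1) inline would work as well, since the rule guarantees the leading equals sign.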

Alternatively, you can use a predicate to disambiguate the rules.

file : prop+ EOF ;
prop : ID '=' STRING NEWLINE ;

ID      : [a-zA-Z]+ ;
STRING  : { isValue() }? (~[\r\n])+; 
NEWLINE : '\r'?'\n' ;

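Here, the isValue method looks backwards on the character stream to verify that the token follows an equals sign. One caveat: a purely alphabetic value such as mysql is matched at the same length by both ID and STRING, and ANTLR resolves such ties in favor of the rule listed first, so STRING may need to be declared before ID for this variant to work. The method could look something like: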

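// Scoped to the lexer: in a combined grammar a plain @members block is
// also injected into the parser, which has no _tokenStartCharIndex.
@lexer::members {
// Returns true if the nearest non-whitespace character before the
// current token start is '='.
public boolean isValue() {
    int offset = _tokenStartCharIndex;
    for (int idx = offset-1; idx >=0; idx--) {
        String s = _input.getText(Interval.of(idx, idx));
        if (Character.isWhitespace(s.charAt(0))) {
            continue; // skip whitespace between '=' and the value
        } else if (s.charAt(0) == '=') {
            return true;
        } else {
            break; // any other character: not a value position
        }
    }
    return false;
}
}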
