ANTLR3 grammar does not match rule with predicate

Question

I have a combined grammar in which I need to provide two identifier lexer rules. Both identifiers can be in use at the same time. Identifier1 comes before Identifier2 in the grammar.

The first identifier is static, whereas the second identifier rule changes depending on a flag (using a predicate).

I want the second identifier to match in parser rules. But because both identifiers can match some of the same inputs, the input does not end up matching identifier2.

I have created a small grammar to illustrate the problem:

@lexer::members
{
  private boolean flag;

  public void setFlag(boolean flag)
  {
    this.flag = flag;
  }
}


identifier1 :
 ID1
 ;

identifier2 :
ID2
; 


ID1 : (CHARS) *;


ID2 : (CHARS | ({flag}? '_'))* ;


fragment CHARS 
: 
  ('a' .. 'z')
;  

If I try to match the identifier2 rule like this:

    ANTLRStringStream in = new ANTLRStringStream("abcabde");
    IdTestLexer lexer = new IdTestLexer(in);
    lexer.setFlag(true);
    CommonTokenStream tokens = new CommonTokenStream(lexer);
    IdTestParser parser = new IdTestParser(tokens);
    parser.identifier2();

It shows the error: line 1:0 missing ID2 at 'abcabde'

Recommended answer

ID1 : (CHARS) *;
ID2 : (CHARS | ({flag}? '_'))* ;

For ANTLR, these two rules mean:

  • If the input is just characters, it's ID1
  • If the input mixes characters and _, and flag == true, it's ID2

Note that if flag == false, ID2 will never be matched: without the underscore alternative it matches exactly the same input as ID1, and ID1 comes first in the grammar.

The two basic rules the lexer follows are:

  • It matches the token that covers the longest input subsequence
  • If multiple tokens can match the same input, it uses the one that appears first in the grammar
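
The two lexer rules above can be sketched as a toy model in plain Java. This is only an illustration, not ANTLR's actual implementation: the class LexerRuleDemo, its Rule type and the firstToken method are hypothetical names, and the {flag}? predicate is approximated by choosing a different regular expression for ID2 depending on the flag.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy model of the two lexer rules; NOT how ANTLR is implemented internally.
class LexerRuleDemo {
    // A token rule: a name plus the regular expression it matches.
    static final class Rule {
        final String name;
        final Pattern pattern;

        Rule(String name, String regex) {
            this.name = name;
            this.pattern = Pattern.compile(regex);
        }
    }

    // Rules in grammar order. The flag stands in for the {flag}? predicate:
    // '_' is only allowed in ID2 while the flag is set (an approximation).
    static List<Rule> rules(boolean flag) {
        List<Rule> rules = new ArrayList<>();
        rules.add(new Rule("ID1", "[a-z]*"));
        rules.add(new Rule("ID2", flag ? "[a-z_]*" : "[a-z]*"));
        return rules;
    }

    // Returns the name of the rule chosen for the token starting at offset 0.
    static String firstToken(String input, boolean flag) {
        String best = null;
        int bestLength = -1;
        for (Rule rule : rules(flag)) {
            Matcher m = rule.pattern.matcher(input);
            // lookingAt() anchors at the start; the greedy '*' takes the longest prefix.
            // The strict '>' keeps the earlier rule when two rules tie on length.
            if (m.lookingAt() && m.end() > bestLength) {
                best = rule.name;
                bestLength = m.end();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(firstToken("abcabde", true));  // ID1 (tie, first rule wins)
        System.out.println(firstToken("abc_de", true));   // ID2 (longer match)
        System.out.println(firstToken("abc_de", false));  // ID1 (ID2 gains nothing)
    }
}
```

Under this model, 'abcabde' is always tokenized as ID1, which is consistent with the parser reporting a missing ID2 in the question.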

I believe your core issue is a misunderstanding of the difference between the lexer and the parser, and of how each is used. The question you should ask yourself is: when should 'abcabde' be matched as ID1, and when as ID2?

  • Always ID1 - then your grammar is correct as it is now.
  • Always ID2 - then you should switch the two rules, but note that in that case ID1 will never be matched.
  • It depends on flag - then you need to modify the predicate according to your logic; just toggling the underscore isn't enough.
  • It depends on where in the input the identifier is used - then this is not something the lexer can decide, and you need to tell the two kinds of identifiers apart in the parser rather than in the lexer. Formally, a lexer recognizes a regular language, while you need a context-free language to decide between identifiers like that.
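
For the last case, one possible approach is to let the lexer emit a single generic identifier token and check its text from the parser rule that consumes it. The sketch below is a hypothetical helper in plain Java, not part of the ANTLR runtime: the class IdentifierKinds and its methods are made-up names, and it assumes the single lexer rule accepts lowercase letters and underscores, with at least one character.

```java
// Hypothetical helper, not part of the ANTLR runtime: classifies the text of
// a single generic identifier token after lexing.
class IdentifierKinds {
    // identifier1: lowercase letters only (the CHARS fragment).
    static boolean isIdentifier1(String text) {
        return text.matches("[a-z]+");
    }

    // identifier2: lowercase letters, plus '_' only while the flag is set.
    static boolean isIdentifier2(String text, boolean flag) {
        return text.matches(flag ? "[a-z_]+" : "[a-z]+");
    }

    public static void main(String[] args) {
        System.out.println(isIdentifier1("abcabde"));        // true
        System.out.println(isIdentifier2("abcabde", true));  // true
        System.out.println(isIdentifier2("abc_de", true));   // true
        System.out.println(isIdentifier2("abc_de", false));  // false
    }
}
```

With this split, 'abcabde' is acceptable as either kind of identifier, and the parser rule that consumes the token decides which check applies based on where the identifier appears.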
