Alternative between parser rule and lexer rule


Question

This question was buried in the update section of another question, so I am asking it separately here.

I am using ANTLR 3.4.

I have a simple grammar that tries to parse two kinds of text: lines that start with "#include", and everything else. Here is my grammar:

cmds
    : cmd+
    ;

cmd
    : include_cmd |  other_cmd
    ;

include_cmd
    : INCLUDE  DOUBLE_QUOTE  FILE_NAME  DOUBLE_QUOTE
    ;

other_cmd
    : (~INCLUDE)+
    ;


INCLUDE
    : '#include'
    ;

DOUBLE_QUOTE
    : '"'
    ;

FILE_NAME
    : ('a'..'z' | 'A'..'Z' | '0'..'9' | '_')+
    ;

New_Line 
    : ('\r' | '\n')+   
    ;

WS 
    : ('\t' | ' ')+  {$channel = HIDDEN;}
    ;

But I get this warning:

Decision can match input such as "{DOUBLE_QUOTE..FILE_NAME, New_Line..WS}" using multiple alternatives: 1, 2

As a result, alternative(s) 2 were disabled for that input

I guess this is because a double quote can match both the other_cmd rule and the DOUBLE_QUOTE rule. But one is a parser rule and the other is a lexer rule, so does this warning make sense?

How can I get rid of this warning?

A side question: the warning message only says "alternatives: 1, 2", and it is not immediately clear to me which is 1 and which is 2. Is there a way to make ANTLR report the alternatives more directly?

Solution

I guess this is because a double quote can match both other_cmd rule and DOUBLE_QUOTE rule, ...

No, that is not the issue, since include_cmd starts with something that other_cmd cannot match.
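
To make that point concrete, here is a minimal Python sketch (not ANTLR code) of the decision inside cmd. The token type names come from the grammar in the question; the sets and the function predict_cmd_alternative are hypothetical names used only for illustration. Because the two sets of possible first tokens are disjoint, one token of lookahead is enough to choose between include_cmd and other_cmd.

```python
# Token types defined by the lexer rules in the question's grammar.
TOKEN_TYPES = {"INCLUDE", "DOUBLE_QUOTE", "FILE_NAME", "New_Line", "WS"}

# Tokens that can start each alternative of `cmd` (illustrative names).
FIRST_INCLUDE_CMD = {"INCLUDE"}                # include_cmd : INCLUDE ...
FIRST_OTHER_CMD = TOKEN_TYPES - {"INCLUDE"}    # other_cmd   : (~INCLUDE)+


def predict_cmd_alternative(lookahead):
    """Return 1 for include_cmd, 2 for other_cmd, based on one lookahead token."""
    if lookahead in FIRST_INCLUDE_CMD:
        return 1
    if lookahead in FIRST_OTHER_CMD:
        return 2
    raise ValueError("unknown token type: " + lookahead)


if __name__ == "__main__":
    for token in ("INCLUDE", "DOUBLE_QUOTE", "FILE_NAME"):
        print(token, "-> alternative", predict_cmd_alternative(token))
```

A DOUBLE_QUOTE as the first token always selects other_cmd, so the choice between the two cmd alternatives is never ambiguous.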

Decision can match input such as "{DOUBLE_QUOTE..FILE_NAME, New_Line..WS}" using multiple alternatives: 1, 2

The warning means that input like foo" (a FILE_NAME followed by a DOUBLE_QUOTE) can be matched by the parser in more than one way:

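1. greedy: the (~INCLUDE)+ loop stays inside a single other_cmd and consumes both tokens

2. ungreedy: the loop exits after the first token, and cmd+ in cmds starts a new other_cmd for the second token

ANTLR will choose the greedy parse, but since an ungreedy parse is also possible, a warning is generated. If you explicitly tell the parser to match greedily, the warning is no longer issued: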

other_cmd
 : (options {greedy=true;} : ~INCLUDE)+
 ;
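
The two parses can be enumerated with a small Python sketch. This is only a model of the grammar's behaviour, not ANTLR code; the helper names other_cmd_splits and greedy_split are hypothetical. It lists every way a run of non-INCLUDE tokens can be divided into consecutive other_cmd matches, and shows the single split the greedy loop produces.

```python
def other_cmd_splits(tokens):
    """Return every way to split `tokens` into consecutive, non-empty
    other_cmd groups, i.e. every parse that cmds : cmd+ allows."""
    if "INCLUDE" in tokens:
        raise ValueError("other_cmd cannot contain INCLUDE")
    if not tokens:
        return [[]]  # exactly one way to split nothing: no groups at all
    results = []
    for end in range(1, len(tokens) + 1):
        head = tokens[:end]  # tokens consumed by the first other_cmd
        for rest in other_cmd_splits(tokens[end:]):
            results.append([head] + rest)
    return results


def greedy_split(tokens):
    """Model of the greedy choice: one other_cmd consumes the whole run."""
    return [list(tokens)] if tokens else []


if __name__ == "__main__":
    tokens = ["FILE_NAME", "DOUBLE_QUOTE"]  # the input foo"
    for split in other_cmd_splits(tokens):
        print(split)
    print("greedy:", greedy_split(tokens))
```

For the two-token input foo" the sketch finds two splits, which correspond to the greedy and ungreedy parses listed above. The number of possible splits doubles with each extra token, which is why ANTLR resolves the decision by disabling one alternative.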

A side question: the warning message only says "alternatives: 1, 2", and it is not immediately clear to me which is 1 and which is 2. Is there a way to make ANTLR report the alternatives more directly?

No, not as far as I know. This warning is indeed rather cryptic. Alternatives usually denote the branches the parser can follow:

parser_rule
 : alternative_1 
 | alternative_2
 | alternative_3
 ;

But in your case, it seems ANTLR is treating token ranges as the alternatives: DOUBLE_QUOTE..FILE_NAME being the first alternative and New_Line..WS the second.
