Parsing Newlines, EOF as End-of-Statement Marker with ANTLR3

Views: 94 · Published: 2020/6/20 18:35:00 · Tags: antlr, antlr3, antlrworks

Question

My question is about running the following grammar in ANTLRWorks:

INT :('0'..'9')+;
SEMICOLON: ';';
NEWLINE: ('\r\n'|'\n'|'\r');
STMTEND: (SEMICOLON (NEWLINE)*|NEWLINE+);

statement
    : STMTEND
    | INT STMTEND
    ;

program: statement+;

With program as the start rule, I get the following results, regardless of which newline NL (CR, LF, or CRLF) or which integer I choose:

"; NL" or "32; NL" parses without error. ";" or "45;" (without a newline) results in an EarlyExitException. "NL" by itself parses without error. "456 NL", without the semicolon, results in a MismatchedTokenException.

What I want is for a statement to be terminated by a newline, a semicolon, or a semicolon followed by a newline, and I want the parser to consume as many contiguous newlines as it can at a termination, so that "; NL NL NL NL" is just one termination, not four or five. I would also like end-of-file to be a valid termination, but I don't know how to do that yet.

So what's wrong with this grammar, and how can I make it terminate cleanly at EOF? I'm completely new to parsing, ANTLR, and EBNF, and I haven't found much material pitched somewhere between the simple calculator example and the reference. (I have The Definitive ANTLR Reference, but it really is a reference, with a quick start at the front that I haven't yet got to run outside of ANTLRWorks.) Any reading suggestions, besides Wirth's 1977 ACM paper, would be helpful too. Thanks!

Recommended answer

In case of input like ";" or "45;", the token STMTEND will never be created.

";" will create a single token: SEMICOLON, and "45;" will produce: INT SEMICOLON.

What you (probably) want is that SEMICOLON and NEWLINE never become real tokens themselves, but are always part of a STMTEND token. You can do that by making them so-called "fragment" rules:

program: statement+;

statement
    : STMTEND
    | INT STMTEND
    ;

INT : '0'..'9'+;
STMTEND : SEMICOLON NEWLINE* | NEWLINE+;

fragment SEMICOLON : ';';
fragment NEWLINE : '\r' '\n' | '\n' | '\r';

Fragment rules are only available to other lexer rules, so they will never end up in parser (production) rules. To emphasize: the grammar above will only ever create either INT or STMTEND tokens.
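
To see the effect without running ANTLR, here is a rough Python sketch that imitates the tokenization of the grammar above with a regular expression. It is not ANTLR and not generated lexer code: the names SEMICOLON, NEWLINE, TOKEN_RE, and tokenize are hypothetical helpers for illustration, and regex matching only approximates how an ANTLR3 lexer chooses between rules. The point it shows is that the two "fragment" pieces are used only inside the STMTEND pattern, so only INT and STMTEND can ever come out.

```python
import re

# Fragment-like building blocks: used only inside other patterns,
# never emitted as tokens themselves (mirroring ANTLR `fragment` rules).
SEMICOLON = r";"
NEWLINE = r"(?:\r\n|\n|\r)"

# The only two token types this sketch can emit.
TOKEN_RE = re.compile(
    rf"(?P<INT>[0-9]+)"
    rf"|(?P<STMTEND>{SEMICOLON}{NEWLINE}*|{NEWLINE}+)"
)


def tokenize(text):
    """Return the list of token type names found in `text`."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        tokens.append(match.lastgroup)  # name of the group that matched
        pos = match.end()
    return tokens


print(tokenize(";"))            # ['STMTEND']
print(tokenize("45;"))          # ['INT', 'STMTEND']
print(tokenize("32;\n\n\n\n"))  # ['INT', 'STMTEND']
print(tokenize("456\r\n"))      # ['INT', 'STMTEND']
```

In this sketch a semicolon followed by any run of newlines collapses into a single STMTEND, which is the "one termination, not four or five" behavior asked for in the question. End-of-file handling is not modeled here.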
