Parse comment line

Views: 239 | Published: 2016/12/21 10:37:08 | Tags: line, antlr, grammar, comments


Problem description

 
Given the following basic grammar, I want to understand how to handle comment lines. What is missing is the handling of the <CR><LF> that usually terminates a comment line; the only exception is a last comment line before the EOF, e.g.:
# comment
abcd := 12 ;
# comment eof without <CR><LF>

grammar CommentLine1a;

//==========================================================
// Options
//==========================================================



//==========================================================
// Lexer Rules
//==========================================================

Int
  : Digit+
  ;

fragment Digit
  : '0'..'9'
  ;

ID_NoDigitStart
  : ( 'a'..'z' | 'A'..'Z' ) ('a'..'z' | 'A'..'Z' | Digit )*
  ;

Whitespace
  : ( ' ' | '\t' | '\r' | '\n' )+ { $channel = HIDDEN ; }
  ; 


//==========================================================
// Parser Rules
//==========================================================

code
  : ( assignment | comment )+
  ;

assignment
  : id_NoDigitStart ':=' id_DigitStart ';'
  ;

id_NoDigitStart
  : ID_NoDigitStart
  ;  

id_DigitStart
  : ( ID_NoDigitStart | Int )+
  ;

comment
  : '#' ~( '\r' | '\n' )*
  ;

Solution
Unless you have a very compelling reason to put the comment inside the parser (which I'd like to hear), you should put it in the lexer:
Comment
  :  '#' ~( '\r' | '\n' )*
  ;
And since you already account for line breaks in your Whitespace rule, there's no problem with input like # comment eof without <CR><LF>: the Comment rule stops before any line break, and at the end of the input it simply ends.
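
Put together, the grammar with the comment moved into the lexer might look like the sketch below. It assumes ANTLR 3 with the Java target (which the `$channel = HIDDEN` action in the question suggests). The grammar name `CommentLine1b` is hypothetical, and the trailing `EOF` in `code` is an addition that is not in the original grammar.

```antlr
grammar CommentLine1b;  // hypothetical name, not from the question

//==========================================================
// Parser Rules
//==========================================================

// EOF is an addition here: it forces the parser to consume the whole input.
code
  : ( assignment | comment )+ EOF
  ;

assignment
  : id_NoDigitStart ':=' id_DigitStart ';'
  ;

id_NoDigitStart
  : ID_NoDigitStart
  ;

id_DigitStart
  : ( ID_NoDigitStart | Int )+
  ;

// The parser now receives a whole comment as a single token.
comment
  : Comment
  ;

//==========================================================
// Lexer Rules
//==========================================================

Int
  : Digit+
  ;

fragment Digit
  : '0'..'9'
  ;

ID_NoDigitStart
  : ( 'a'..'z' | 'A'..'Z' ) ( 'a'..'z' | 'A'..'Z' | Digit )*
  ;

// '#' followed by every character up to, but not including, a line break.
// The loop also stops at the end of the input, so a last comment line
// without <CR><LF> is matched as well.
Comment
  : '#' ~( '\r' | '\n' )*
  ;

// Line breaks are consumed here and hidden from the parser.
Whitespace
  : ( ' ' | '\t' | '\r' | '\n' )+ { $channel = HIDDEN; }
  ;
```

If the parser never needs to see comments at all, one option is to add `{ $channel = HIDDEN; }` to `Comment` as well and drop the `comment` parser rule; `code` would then have to accept zero assignments so that an input consisting only of comments still parses.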

Also, if you use literal tokens inside parser rules, ANTLR automatically creates lexer rules for them behind the scenes. So in your case:
comment
  :  '#' ~( '\r' | '\n' )*
  ;
would match a '#' token followed by zero or more tokens other than '\r' and '\n', and not a '#' followed by zero or more characters other than '\r' and '\n'.

For future reference:

Inside parser rules:

~ negates tokens
. matches any token

Inside lexer rules:

~ negates characters
. matches any character in the range 0x0000 ... 0xFFFF
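
To make the difference concrete, here is a small sketch, again assuming ANTLR 3. The grammar name `NegationDemo` and all rule names in it are hypothetical and chosen only for illustration; none of them come from the question.

```antlr
grammar NegationDemo;  // hypothetical grammar, for illustration only

// Parser rule: '~' and '.' operate on whole tokens.
// Matches one token that is not a Semi, followed by any one token.
pair
  : ~Semi .
  ;

// Lexer rules: '~' and '.' operate on single characters.
Semi
  : ';'
  ;

// '~' in the lexer: a double-quoted string whose body is any run of
// characters other than '"' and line breaks.
Str
  : '"' ~( '"' | '\r' | '\n' )* '"'
  ;

// '.' in the lexer: any character, matched non-greedily up to the
// closing '*/'.
BlockComment
  : '/*' ( options { greedy = false; } : . )* '*/' { $channel = HIDDEN; }
  ;

Whitespace
  : ( ' ' | '\t' | '\r' | '\n' )+ { $channel = HIDDEN; }
  ;
```

In `pair`, the operand of `~` is a token name, while in `Str` the operands are character literals. This is why the original `comment` parser rule did not behave like a character-level match.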

