ANTLR: removing clutter

Question

I'm learning ANTLR right now. Let's say I have some VHDL code and would like to do some processing on the PROCESS blocks. The rest should be completely ignored. I don't want to describe the whole VHDL language, since I'm only interested in the process blocks. So I could write a rule that matches process blocks. But how do I tell ANTLR to match only the process block rule and ignore everything else?

Recommended answer

I know next to no VHDL, so let's say you want to replace all single-line comments in a (Java) source file with multi-line comments:

//foo

should become:

/* foo */

You need to let the lexer match single-line comments, of course. But you should also make sure it recognizes multi-line comments, because you don't want //bar to be recognized as a single-line comment in:

/*
//bar
*/

The same goes for string literals:

String s = "no // comment";

Finally, you should create a catch-all rule in the lexer that matches any single character, so that everything you are not interested in still becomes a token and passes through unchanged.

A quick demo:

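grammar T;

// Entry point: print the text of every token, in order.
parse
 : (t=. {System.out.print($t.text);})* EOF
 ;

// String literal; a backslash escapes the character that follows it.
Str
 : '"' ('\\' . | ~('\\' | '"'))* '"'
 ;

// Multi-line comment. ANTLR 3 treats '.*' before '*/' as non-greedy;
// ANTLR 4 needs '.*?' here instead.
MLComment
 : '/*' .* '*/'
 ;

// Single-line comment: replace the token's text with a multi-line comment.
SLComment
 : '//' ~('\r' | '\n')*
   {
     setText("/* " + getText().substring(2) + " */");
   }
 ;

Any
 : . // fall through rule, matches any character
 ;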

If you now parse input like this:

//comment 1
class Foo {

  //comment 2

  /* 
   * not // a comment
   */
  String s = "not // a // comment"; //comment 3
}

The following will be printed to your console:

/* comment 1 */
class Foo {

  /* comment 2 */

  /* 
   * not // a comment
   */
  String s = "not // a // comment"; /* comment 3 */
}

Note that this is just a quick demo: a string literal in Java can contain Unicode escapes, which my demo doesn't support, and my demo does not handle char literals either (the char literal in char c = '"'; would break it). All of these things are quite easy to fix, of course.
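
For readers who want to try the idea without generating a lexer, here is a minimal hand-written sketch of the same approach in plain Java. It is not part of the demo above: the class name CommentRewriter and the helper skipQuoted are made up for illustration. It recognizes the constructs that can hide a // (string literals, char literals, multi-line comments), rewrites single-line comments, and copies every other character through, which is the role the Any rule plays in the grammar. Like the grammar, it ignores Unicode escapes, and a single-line comment that itself contains */ would produce broken output.

```java
public class CommentRewriter {

    // Rewrites each single-line comment in src as a multi-line comment
    // and copies everything else through unchanged.
    static String rewrite(String src) {
        StringBuilder out = new StringBuilder();
        int n = src.length();
        int i = 0;
        while (i < n) {
            char c = src.charAt(i);
            if (c == '"' || c == '\'') {
                // String or char literal: copy it whole so that a "//"
                // or a quote character inside it is not misread.
                int end = skipQuoted(src, i, c);
                out.append(src, i, end);
                i = end;
            } else if (src.startsWith("/*", i)) {
                // Multi-line comment: copy up to and including the first "*/".
                int close = src.indexOf("*/", i + 2);
                int end = (close < 0) ? n : close + 2;
                out.append(src, i, end);
                i = end;
            } else if (src.startsWith("//", i)) {
                // Single-line comment: runs up to, not including, the line end.
                int end = i + 2;
                while (end < n && src.charAt(end) != '\r' && src.charAt(end) != '\n') {
                    end++;
                }
                out.append("/* ").append(src, i + 2, end).append(" */");
                i = end;
            } else {
                // Catch-all: any other character passes through as-is.
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    // Returns the index just past the closing quote of the literal that
    // starts at 'start', or src.length() if the literal is unterminated.
    static int skipQuoted(String src, int start, char quote) {
        int i = start + 1;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (c == '\\') {
                i += 2; // skip the backslash and the escaped character
            } else if (c == quote) {
                return i + 1;
            } else {
                i++;
            }
        }
        return src.length();
    }

    public static void main(String[] args) {
        String input = "//comment 1\n"
                + "class Foo {\n"
                + "  char c = '\"'; //comment 2\n"
                + "  String s = \"not // a // comment\"; //comment 3\n"
                + "}\n";
        System.out.print(rewrite(input));
    }
}
```

The order of the branches matters in the same way the set of lexer rules does: literals and multi-line comments have to be consumed as a unit before the // check gets a chance to look inside them.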
