Partial grammar for counting class count


Problem description

I need to count the number of classes in a valid C# source file. I wrote the following grammar:

grammar CSharpClassGrammar;

options
{
        language=CSharp2;

}

@parser::namespace { CSharpClassGrammar.Generated }
@lexer::namespace  { CSharpClassGrammar.Generated }

@header
{
        using System;
        using System.Collections.Generic;

}

@members
{
        private List<string> _classCollector = new List<string>();
        public List<string> ClassCollector { get { return _classCollector; } }

}

/*------------------------------------------------------------------
 * PARSER RULES
 *------------------------------------------------------------------*/

csfile  : class_declaration* EOF
        ;

class_declaration
        : (ACCESSLEVEL | MODIFIERS)* PARTIAL? 'class' CLASSNAME
          class_body
          ';'?
          { _classCollector.Add($CLASSNAME.text); }
        ;

class_body
        : '{' class_declaration* '}'
        ;

/*------------------------------------------------------------------
 * LEXER RULES
 *------------------------------------------------------------------*/

ACCESSLEVEL
        : 'public' | 'internal' | 'protected' | 'private' | 'protected internal'
        ;

MODIFIERS
        : 'static' | 'sealed' | 'abstract'
        ;

PARTIAL
        : 'partial'
        ;

CLASSNAME
        : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
        ;

COMMENT
        : '//' ~('\n'|'\r')* {$channel=HIDDEN;}
        |   '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
        ;

WHITESPACE
        : ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+ { $channel = HIDDEN; }
        ; 

This parser correctly counts classes with an empty class body, including nested ones:

internal class DeclarationClass1
{
    class DeclarationClass2
    {
        public class DeclarationClass3
        {
            abstract class DeclarationClass4
            {
            }
        }
    }
}

I also need to count classes whose body is not empty, such as:

class TestClass
{
    int a = 42;

    class Nested { }
}

I need to somehow ignore all the code that is not a class declaration. In the example above, that means ignoring

int a = 42;

How can I do this? Is there perhaps an example for another language?
Please help!

Recommended answer

When you're only interested in certain parts of a source file, you can set filter=true in the options { ... } section of your grammar. This lets you define only the tokens you're interested in; any input that matches none of your rules is skipped by the lexer.

Note that this only works in lexer grammars, not in combined (or parser) grammars.

A small demo:

lexer grammar CSharpClassLexer;

options {
  language=CSharp2;
  filter=true;
}

@namespace { Demo }

Comment
  :  '//' ~('\r' | '\n')*
  |  '/*' .* '*/'
  ;

String
  :  '"' ('\\' . | ~('"' | '\\' | '\r' | '\n'))* '"'
  |  '@' '"' ('"' '"' | ~'"')* '"'
  ;

Class
  :  'class' Space+ Identifier 
     {Console.WriteLine("Found class: " + $Identifier.text);}
  ;

Space
  :  ' ' | '\t' | '\r' | '\n'
  ;

Identifier
  :  ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | '0'..'9')*
  ;

It's important to keep the Identifier rule in there, because you don't want Xclass Foo to be tokenized as ['X', 'class', 'Foo']. With Identifier present, Xclass is matched as one whole identifier, so the Class rule never gets a chance to match inside it.
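The effect of the Identifier rule can be illustrated outside ANTLR. The following is only a rough analogy, using .NET regular expressions instead of the generated lexer; FindClassNames and its respectIdentifierBoundary flag are made-up names for this sketch. Without a boundary check the keyword is found inside Xclass, while with the check only the real declaration is reported.

```csharp
using System;
using System.Text.RegularExpressions;

static class IdentifierBoundaryDemo
{
    // Hypothetical helper (not part of ANTLR): returns the name that follows
    // every match of the "class" keyword in the given source text.
    public static string[] FindClassNames(string source, bool respectIdentifierBoundary)
    {
        // The lookbehind plays the role of the Identifier rule: "class" must not
        // be the tail of a longer identifier such as "Xclass".
        string boundary = respectIdentifierBoundary ? @"(?<![A-Za-z0-9_])" : "";
        string pattern = boundary + @"class\s+([A-Za-z_][A-Za-z0-9_]*)";

        MatchCollection matches = Regex.Matches(source, pattern);
        string[] names = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            names[i] = matches[i].Groups[1].Value;
        }
        return names;
    }

    public static void Main()
    {
        string source = "Xclass Foo; class Bar { }";

        // No boundary check: "class Foo" is wrongly found inside "Xclass Foo".
        Console.WriteLine(string.Join(",", FindClassNames(source, false))); // prints "Foo,Bar"

        // With the boundary check, only the real declaration remains.
        Console.WriteLine(string.Join(",", FindClassNames(source, true)));  // prints "Bar"
    }
}
```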

The grammar can be tested with the following class:

using System;
using Antlr.Runtime;

namespace Demo
{
    class MainClass
    {
        public static void Main (string[] args)
        {
            string source = 
@"class TestClass
{
    int a = 42;

    string _class = ""inside a string literal: class FooBar {}..."";

    class Nested { 
        /* class NotAClass {} */

        // class X { }

        class DoubleNested {
            string str = @""
                multi line string 
                class Bar {}
            "";
        }
    }
}";
            Console.WriteLine("source=\n" + source + "\n-------------------------");
            ANTLRStringStream Input = new ANTLRStringStream(source);
            CSharpClassLexer Lexer = new CSharpClassLexer(Input);
            CommonTokenStream Tokens = new CommonTokenStream(Lexer);
            Tokens.GetTokens();
        }
    }
}

which produces the following output:

source=
class TestClass
{
    int a = 42;

    string _class = "inside a string literal: class FooBar {}...";

    class Nested { 
        /* class NotAClass {} */

        // class X { }

        class DoubleNested {
            string str = @"
                multi line string 
                class Bar {}
            ";
        }
    }
}
-------------------------
Found class: TestClass
Found class: Nested
Found class: DoubleNested

Note that this is just a quick demo. I am not sure the grammar handles C# string literals properly (I am unfamiliar with C#), but it should give you a start.
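Since the question asks for a count rather than printed names, here is a hand-written sketch of the same filtering idea in plain C#. It is not ANTLR-generated code and not the demo lexer itself; ClassCounter and FindClasses are hypothetical names. It assumes the same simplifications as the demo grammar: comments, regular strings and verbatim strings are skipped, identifiers are consumed whole, and everything else is ignored. Like the grammar, it does not handle character literals.

```csharp
using System;
using System.Collections.Generic;

static class ClassCounter
{
    static bool IsIdentStart(char c)
    {
        return c == '_' || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    static bool IsIdentPart(char c)
    {
        return IsIdentStart(c) || (c >= '0' && c <= '9');
    }

    // Reads an identifier starting at index i and advances i past it.
    static string ReadIdentifier(string s, ref int i)
    {
        int start = i;
        while (i < s.Length && IsIdentPart(s[i])) i++;
        return s.Substring(start, i - start);
    }

    // Hypothetical helper: collects the name after every "class" keyword,
    // skipping comments and string literals along the way.
    public static List<string> FindClasses(string s)
    {
        var names = new List<string>();
        int i = 0;
        while (i < s.Length)
        {
            if (string.CompareOrdinal(s, i, "//", 0, 2) == 0)
            {
                // Line comment: skip to the end of the line.
                while (i < s.Length && s[i] != '\r' && s[i] != '\n') i++;
            }
            else if (string.CompareOrdinal(s, i, "/*", 0, 2) == 0)
            {
                // Block comment: skip to the first closing "*/".
                int end = s.IndexOf("*/", i + 2, StringComparison.Ordinal);
                i = end < 0 ? s.Length : end + 2;
            }
            else if (string.CompareOrdinal(s, i, "@\"", 0, 2) == 0)
            {
                // Verbatim string: a doubled quote is an escaped quote.
                i += 2;
                while (i < s.Length)
                {
                    if (s[i] != '"') { i++; }
                    else if (i + 1 < s.Length && s[i + 1] == '"') { i += 2; }
                    else { i++; break; }
                }
            }
            else if (s[i] == '"')
            {
                // Regular string: skip to the closing quote, honoring backslash escapes.
                i++;
                while (i < s.Length && s[i] != '"') i += (s[i] == '\\') ? 2 : 1;
                i++;
            }
            else if (IsIdentStart(s[i]))
            {
                // Consume the whole identifier so "_class" or "Xclass" never match.
                string word = ReadIdentifier(s, ref i);
                if (word == "class")
                {
                    int j = i;
                    while (j < s.Length && char.IsWhiteSpace(s[j])) j++;
                    if (j > i && j < s.Length && IsIdentStart(s[j]))
                    {
                        i = j;
                        names.Add(ReadIdentifier(s, ref i));
                    }
                }
            }
            else
            {
                i++; // Anything else is ignored, as in a filtering lexer.
            }
        }
        return names;
    }

    public static void Main()
    {
        string source = "class A { int a = 42; string s = \"class B {}\"; /* class C {} */ class D { } }";
        List<string> names = FindClasses(source);
        Console.WriteLine(names.Count + ": " + string.Join(", ", names)); // prints "2: A, D"
    }
}
```

The count is simply the length of the returned list. With the ANTLR demo, the equivalent would be to add each name to a collection in the Class rule's action instead of printing it, much as the question's grammar already does with _classCollector.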

Good luck!
