Using ANTLR 3.3?


Question


I'm trying to get started with ANTLR and C# but I'm finding it extraordinarily difficult due to the lack of documentation/tutorials. I've found a couple half-hearted tutorials for older versions, but it seems there have been some major changes to the API since.

Can anyone give me a simple example of how to create a grammar and use it in a short program?

I've finally managed to get my grammar file compiling into a lexer and parser, and I can get those compiled and running in Visual Studio (after having to recompile the ANTLR source because the C# binaries seem to be out of date too! -- not to mention the source doesn't compile without some fixes), but I still have no idea what to do with my parser/lexer classes. Supposedly it can produce an AST given some input...and then I should be able to do something fancy with that.

Solution

Let's say you want to parse simple expressions consisting of the following tokens:

  • - subtraction (also unary);
  • + addition;
  • * multiplication;
  • / division;
  • (...) grouping (sub) expressions;
  • integer and decimal numbers.

An ANTLR grammar could look like this:

grammar Expression;

options {
  language=CSharp2;
}

parse
  :  exp EOF 
  ;

exp
  :  addExp
  ;

addExp
  :  mulExp (('+' | '-') mulExp)*
  ;

mulExp
  :  unaryExp (('*' | '/') unaryExp)*
  ;

unaryExp
  :  '-' atom 
  |  atom
  ;

atom
  :  Number
  |  '(' exp ')' 
  ;

Number
  :  ('0'..'9')+ ('.' ('0'..'9')+)?
  ;

Now to create a proper AST, you add output=AST; in your options { ... } section, and you mix some "tree operators" in your grammar defining which tokens should be the root of a tree. There are two ways to do this:

  1. add ^ and ! after your tokens. The ^ makes the token the root of a subtree, and the ! excludes the token from the AST;
  2. use "rewrite rules": ... -> ^(Root Child Child ...).

Take the rule foo for example:

foo
  :  TokenA TokenB TokenC TokenD
  ;

and let's say you want TokenB to become the root and TokenA and TokenC to become its children, and you want to exclude TokenD from the tree. Here's how to do that using option 1:

foo
  :  TokenA TokenB^ TokenC TokenD!
  ;

and here's how to do that using option 2:

foo
  :  TokenA TokenB TokenC TokenD -> ^(TokenB TokenA TokenC)
  ;
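To make the resulting shape concrete, here is a minimal sketch in plain C# of the tree that both variants of foo describe. `SketchNode` and `FooTreeSketch` are hypothetical names made up for this illustration; they are not part of the ANTLR runtime, which builds its own tree objects for you.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for a tree node; this is NOT ANTLR's CommonTree.
public class SketchNode
{
    public readonly string Text;
    public readonly List<SketchNode> Children = new List<SketchNode>();

    public SketchNode(string text, params SketchNode[] children)
    {
        Text = text;
        Children.AddRange(children);
    }

    // Renders the node in the (root child child ...) shape used by rewrite
    // rules: a leaf prints as its text, an inner node as "(root children)".
    public string ToStringTree()
    {
        if (Children.Count == 0)
        {
            return Text;
        }

        StringBuilder sb = new StringBuilder();
        sb.Append('(').Append(Text);
        foreach (SketchNode child in Children)
        {
            sb.Append(' ').Append(child.ToStringTree());
        }
        sb.Append(')');
        return sb.ToString();
    }
}

public static class FooTreeSketch
{
    // TokenB is the root, TokenA and TokenC are its children,
    // and TokenD does not appear in the tree at all.
    public static SketchNode Build()
    {
        return new SketchNode("TokenB",
            new SketchNode("TokenA"),
            new SketchNode("TokenC"));
    }
}
```

Calling `FooTreeSketch.Build().ToStringTree()` yields `(TokenB TokenA TokenC)`, which is the rewrite-rule notation without the leading caret.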

So, here's the grammar with the tree operators in it:

grammar Expression;

options {
  language=CSharp2;
  output=AST;
}

tokens {
  ROOT;
  UNARY_MIN;
}

@parser::namespace { Demo.Antlr }
@lexer::namespace { Demo.Antlr }

parse
  :  exp EOF -> ^(ROOT exp)
  ;

exp
  :  addExp
  ;

addExp
  :  mulExp (('+' | '-')^ mulExp)*
  ;

mulExp
  :  unaryExp (('*' | '/')^ unaryExp)*
  ;

unaryExp
  :  '-' atom -> ^(UNARY_MIN atom)
  |  atom
  ;

atom
  :  Number
  |  '(' exp ')' -> exp
  ;

Number
  :  ('0'..'9')+ ('.' ('0'..'9')+)?
  ;

Space 
  :  (' ' | '\t' | '\r' | '\n'){Skip();}
  ;

I also added a Space rule to ignore any white spaces in the source file and added some extra tokens and namespaces for the lexer and parser. Note that the order is important (options { ... } first, then tokens { ... } and finally the @... {}-namespace declarations).

That's it.

Now generate a lexer and parser from your grammar file:

java -cp antlr-3.2.jar org.antlr.Tool Expression.g

and put the generated .cs files in your project together with the C# runtime DLLs.

You can test it using the following class:

using System;
using Antlr.Runtime;
using Antlr.Runtime.Tree;

namespace Demo.Antlr
{
  class MainClass
  {
    public static void Preorder(ITree Tree, int Depth) 
    {
      if(Tree == null)
      {
        return;
      }

      for (int i = 0; i < Depth; i++)
      {
        Console.Write("  ");
      }

      Console.WriteLine(Tree);

      Preorder(Tree.GetChild(0), Depth + 1);
      Preorder(Tree.GetChild(1), Depth + 1);
    }

    public static void Main (string[] args)
    {
      ANTLRStringStream Input = new ANTLRStringStream("(12.5 + 56 / -7) * 0.5"); 
      ExpressionLexer Lexer = new ExpressionLexer(Input);
      CommonTokenStream Tokens = new CommonTokenStream(Lexer);
      ExpressionParser Parser = new ExpressionParser(Tokens);
      ExpressionParser.parse_return ParseReturn = Parser.parse();
      CommonTree Tree = (CommonTree)ParseReturn.Tree;
      Preorder(Tree, 0);
    }
  }
}

which produces the following output:

ROOT
  *
    +
      12.5
      /
        56
        UNARY_MIN
          7
    0.5

which corresponds to the following AST:

(diagram created using graph.gafol.net)
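Once you have a tree like this, "doing something with it" usually means a recursive walk. The sketch below is plain C# under a few assumptions: `AstNode` and `AstWalkSketch` are hypothetical names invented here (not ANTLR types), and the sample tree is built by hand to match the output above rather than produced by the parser. It shows a pre-order rendering that visits every child (the Preorder method above only looks at children 0 and 1, which is enough for this grammar) and a small evaluator that switches on the node text.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

// Hypothetical stand-in for an AST node; with ANTLR you would walk the
// tree returned by the parser instead.
public class AstNode
{
    public readonly string Text;
    public readonly List<AstNode> Children = new List<AstNode>();

    public AstNode(string text, params AstNode[] children)
    {
        Text = text;
        Children.AddRange(children);
    }
}

public static class AstWalkSketch
{
    // Pre-order rendering: the node first, then all of its children,
    // indented two spaces per level.
    public static string Render(AstNode node)
    {
        StringBuilder sb = new StringBuilder();
        Render(node, 0, sb);
        return sb.ToString();
    }

    private static void Render(AstNode node, int depth, StringBuilder sb)
    {
        sb.Append(new string(' ', depth * 2)).Append(node.Text).Append('\n');
        foreach (AstNode child in node.Children)
        {
            Render(child, depth + 1, sb);
        }
    }

    // Evaluates the tree bottom-up; any node that is not an operator is
    // assumed to be a number literal.
    public static double Evaluate(AstNode node)
    {
        switch (node.Text)
        {
            case "ROOT": return Evaluate(node.Children[0]);
            case "UNARY_MIN": return -Evaluate(node.Children[0]);
            case "+": return Evaluate(node.Children[0]) + Evaluate(node.Children[1]);
            case "-": return Evaluate(node.Children[0]) - Evaluate(node.Children[1]);
            case "*": return Evaluate(node.Children[0]) * Evaluate(node.Children[1]);
            case "/": return Evaluate(node.Children[0]) / Evaluate(node.Children[1]);
            default: return double.Parse(node.Text, CultureInfo.InvariantCulture);
        }
    }

    // Hand-built copy of the tree for "(12.5 + 56 / -7) * 0.5".
    public static AstNode SampleTree()
    {
        return new AstNode("ROOT",
            new AstNode("*",
                new AstNode("+",
                    new AstNode("12.5"),
                    new AstNode("/",
                        new AstNode("56"),
                        new AstNode("UNARY_MIN", new AstNode("7")))),
                new AstNode("0.5")));
    }
}
```

`AstWalkSketch.Evaluate(AstWalkSketch.SampleTree())` returns 2.25 for the sample tree, and `Render` reproduces the indented listing shown above. The same switch-on-text idea carries over to ANTLR's own tree nodes, although comparing token types is generally more robust than comparing text.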

Note that ANTLR 3.3 has just been released and the CSharp target is "in beta". That's why I used ANTLR 3.2 in my example.

In case of rather simple languages (like my example above), you could also evaluate the result on the fly without creating an AST. You can do that by embedding plain C# code inside your grammar file, and letting your parser rules return a specific value.

Here's an example:

grammar Expression;

options {
  language=CSharp2;
}

@parser::namespace { Demo.Antlr }
@lexer::namespace { Demo.Antlr }

parse returns [double value]
  :  exp EOF {$value = $exp.value;}
  ;

exp returns [double value]
  :  addExp {$value = $addExp.value;}
  ;

addExp returns [double value]
  :  a=mulExp       {$value = $a.value;}
     ( '+' b=mulExp {$value += $b.value;}
     | '-' b=mulExp {$value -= $b.value;}
     )*
  ;

mulExp returns [double value]
  :  a=unaryExp       {$value = $a.value;}
     ( '*' b=unaryExp {$value *= $b.value;}
     | '/' b=unaryExp {$value /= $b.value;}
     )*
  ;

unaryExp returns [double value]
  :  '-' atom {$value = -1.0 * $atom.value;}
  |  atom     {$value = $atom.value;}
  ;

atom returns [double value]
  :  Number      {$value = Double.Parse($Number.Text, System.Globalization.CultureInfo.InvariantCulture);}
  |  '(' exp ')' {$value = $exp.value;}
  ;

Number
  :  ('0'..'9')+ ('.' ('0'..'9')+)?
  ;

Space 
  :  (' ' | '\t' | '\r' | '\n'){Skip();}
  ;

which can be tested with the class:

using System;
using Antlr.Runtime;

namespace Demo.Antlr
{
  class MainClass
  {
    public static void Main (string[] args)
    {
      string expression = "(12.5 + 56 / -7) * 0.5";
      ANTLRStringStream Input = new ANTLRStringStream(expression);  
      ExpressionLexer Lexer = new ExpressionLexer(Input);
      CommonTokenStream Tokens = new CommonTokenStream(Lexer);
      ExpressionParser Parser = new ExpressionParser(Tokens);
      Console.WriteLine(expression + " = " + Parser.parse());
    }
  }
}

and produces the following output:

(12.5 + 56 / -7) * 0.5 = 2.25
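If it helps to see what such a grammar boils down to, here is a hand-written recursive-descent evaluator in plain C# that mirrors the rules one method per rule. This is only a sketch for comparison: `ExpressionEvaluatorSketch` is a hypothetical class written for this answer, it is not the code ANTLR generates, and it folds the lexer rules (Number, Space) into the parser for brevity.

```csharp
using System;
using System.Globalization;

public class ExpressionEvaluatorSketch
{
    private readonly string _src;
    private int _pos;

    private ExpressionEvaluatorSketch(string src)
    {
        _src = src;
        _pos = 0;
    }

    // parse : exp EOF
    public static double Evaluate(string src)
    {
        ExpressionEvaluatorSketch p = new ExpressionEvaluatorSketch(src);
        double value = p.AddExp();
        p.SkipSpace();
        if (p._pos != src.Length)
        {
            throw new FormatException("Unexpected input at position " + p._pos);
        }
        return value;
    }

    // Space : ' ' | '\t' | '\r' | '\n' (skipped)
    private void SkipSpace()
    {
        while (_pos < _src.Length && " \t\r\n".IndexOf(_src[_pos]) >= 0) _pos++;
    }

    // Consumes the character c if it is next (after whitespace).
    private bool Accept(char c)
    {
        SkipSpace();
        if (_pos < _src.Length && _src[_pos] == c)
        {
            _pos++;
            return true;
        }
        return false;
    }

    // addExp : mulExp (('+' | '-') mulExp)*
    private double AddExp()
    {
        double value = MulExp();
        while (true)
        {
            if (Accept('+')) value += MulExp();
            else if (Accept('-')) value -= MulExp();
            else return value;
        }
    }

    // mulExp : unaryExp (('*' | '/') unaryExp)*
    private double MulExp()
    {
        double value = UnaryExp();
        while (true)
        {
            if (Accept('*')) value *= UnaryExp();
            else if (Accept('/')) value /= UnaryExp();
            else return value;
        }
    }

    // unaryExp : '-' atom | atom
    private double UnaryExp()
    {
        return Accept('-') ? -Atom() : Atom();
    }

    // atom : Number | '(' exp ')'
    private double Atom()
    {
        if (Accept('('))
        {
            double inner = AddExp();
            if (!Accept(')'))
            {
                throw new FormatException("Expected ')' at position " + _pos);
            }
            return inner;
        }
        return Number();
    }

    // Number : ('0'..'9')+ ('.' ('0'..'9')+)?
    private double Number()
    {
        SkipSpace();
        int start = _pos;
        while (_pos < _src.Length && IsDigit(_src[_pos])) _pos++;
        if (_pos == start)
        {
            throw new FormatException("Expected a number at position " + _pos);
        }
        if (_pos + 1 < _src.Length && _src[_pos] == '.' && IsDigit(_src[_pos + 1]))
        {
            _pos++;
            while (_pos < _src.Length && IsDigit(_src[_pos])) _pos++;
        }
        return double.Parse(_src.Substring(start, _pos - start), CultureInfo.InvariantCulture);
    }

    private static bool IsDigit(char c)
    {
        return c >= '0' && c <= '9';
    }
}
```

`ExpressionEvaluatorSketch.Evaluate("(12.5 + 56 / -7) * 0.5")` returns 2.25, the same result as the generated parser. The point of using ANTLR is that you only write the grammar and the embedded actions; the loops and lookahead shown here are what the tool takes care of.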

EDIT

In the comments, Ralph wrote:

Tip for those using Visual Studio: you can put something like java -cp "$(ProjectDir)antlr-3.2.jar" org.antlr.Tool "$(ProjectDir)Expression.g" in the pre-build events, then you can just modify your grammar and run the project without having to worry about rebuilding the lexer/parser.
