Using ANTLR 3.3?


Question


I'm trying to get started with ANTLR and C# but I'm finding it extraordinarily difficult due to the lack of documentation/tutorials. I've found a couple half-hearted tutorials for older versions, but it seems there have been some major changes to the API since.

Can anyone give me a simple example of how to create a grammar and use it in a short program?

I've finally managed to get my grammar file compiling into a lexer and parser, and I can get those compiled and running in Visual Studio (after having to recompile the ANTLR source because the C# binaries seem to be out of date too! -- not to mention the source doesn't compile without some fixes), but I still have no idea what to do with my parser/lexer classes. Supposedly it can produce an AST given some input...and then I should be able to do something fancy with that.

Solution

Let's say you want to parse simple expressions consisting of the following tokens:

  • - subtraction (also unary);
  • + addition;
  • * multiplication;
  • / division;
  • (...) grouping (sub) expressions;
  • integer and decimal numbers.

An ANTLR grammar could look like this:

grammar Expression;

options {
  language=CSharp2;
}

parse
  :  exp EOF 
  ;

exp
  :  addExp
  ;

addExp
  :  mulExp (('+' | '-') mulExp)*
  ;

mulExp
  :  unaryExp (('*' | '/') unaryExp)*
  ;

unaryExp
  :  '-' atom 
  |  atom
  ;

atom
  :  Number
  |  '(' exp ')' 
  ;

Number
  :  ('0'..'9')+ ('.' ('0'..'9')+)?
  ;

Now, to create a proper AST, add output=AST; to your options { ... } section and mix some "tree operators" into your grammar to define which tokens should become the root of a tree. There are two ways to do this:

  1. add ^ or ! after your tokens: ^ makes the token the root, and ! excludes the token from the AST;
  2. use "rewrite rules": ... -> ^(Root Child Child ...).

Take the rule foo for example:

foo
  :  TokenA TokenB TokenC TokenD
  ;

and let's say you want TokenB to become the root and TokenA and TokenC to become its children, and you want to exclude TokenD from the tree. Here's how to do that using option 1:

foo
  :  TokenA TokenB^ TokenC TokenD!
  ;

and here's how to do that using option 2:

foo
  :  TokenA TokenB TokenC TokenD -> ^(TokenB TokenA TokenC)
  ;
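Both variants of foo produce the same tree: TokenB on top, with TokenA and TokenC below it. Here is a minimal sketch of that shape in plain C#. SketchNode, ToLispString and TreeShapeDemo are made-up names for illustration only, not part of the ANTLR runtime; in a real program the nodes come from the generated parser.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for a tree node; NOT an ANTLR runtime class.
public sealed class SketchNode
{
    public readonly string Text;
    public readonly List<SketchNode> Children = new List<SketchNode>();

    public SketchNode(string text, params SketchNode[] children)
    {
        Text = text;
        Children.AddRange(children);
    }

    // Renders the node as "(root child child ...)", leaves as plain text.
    public string ToLispString()
    {
        if (Children.Count == 0)
        {
            return Text;
        }

        StringBuilder sb = new StringBuilder("(" + Text);
        foreach (SketchNode child in Children)
        {
            sb.Append(' ').Append(child.ToLispString());
        }
        return sb.Append(')').ToString();
    }
}

public static class TreeShapeDemo
{
    // Shape described by "TokenA TokenB^ TokenC TokenD!" and by
    // "-> ^(TokenB TokenA TokenC)": TokenB is the root, TokenD is dropped.
    public static SketchNode BuildFoo()
    {
        return new SketchNode("TokenB",
            new SketchNode("TokenA"),
            new SketchNode("TokenC"));
    }

    public static void Main()
    {
        Console.WriteLine(BuildFoo().ToLispString()); // prints (TokenB TokenA TokenC)
    }
}
```

The printed form has the same layout as the right-hand side of the rewrite rule, minus the leading ^.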

So, here's the grammar with the tree operators in it:

grammar Expression;

options {
  language=CSharp2;
  output=AST;
}

tokens {
  ROOT;
  UNARY_MIN;
}

@parser::namespace { Demo.Antlr }
@lexer::namespace { Demo.Antlr }

parse
  :  exp EOF -> ^(ROOT exp)
  ;

exp
  :  addExp
  ;

addExp
  :  mulExp (('+' | '-')^ mulExp)*
  ;

mulExp
  :  unaryExp (('*' | '/')^ unaryExp)*
  ;

unaryExp
  :  '-' atom -> ^(UNARY_MIN atom)
  |  atom
  ;

atom
  :  Number
  |  '(' exp ')' -> exp
  ;

Number
  :  ('0'..'9')+ ('.' ('0'..'9')+)?
  ;

Space 
  :  (' ' | '\t' | '\r' | '\n'){Skip();}
  ;

I also added a Space rule to ignore whitespace in the source file, and added some extra tokens and namespaces for the lexer and parser. Note that the order is important: options { ... } first, then tokens { ... }, and finally the @... { } namespace declarations.

That's it.

Now generate a lexer and parser from your grammar file:

java -cp antlr-3.2.jar org.antlr.Tool Expression.g

and put the generated .cs files in your project together with the C# runtime DLLs.

You can test it using the following class:

using System;
using Antlr.Runtime;
using Antlr.Runtime.Tree;

namespace Demo.Antlr
{
  class MainClass
  {
    public static void Preorder(ITree Tree, int Depth) 
    {
      if(Tree == null)
      {
        return;
      }

      for (int i = 0; i < Depth; i++)
      {
        Console.Write("  ");
      }

      Console.WriteLine(Tree);

      // Visit every child, not just the first two, so nodes of any arity are printed.
      for (int i = 0; i < Tree.ChildCount; i++)
      {
        Preorder(Tree.GetChild(i), Depth + 1);
      }
    }

    public static void Main (string[] args)
    {
      ANTLRStringStream Input = new ANTLRStringStream("(12.5 + 56 / -7) * 0.5"); 
      ExpressionLexer Lexer = new ExpressionLexer(Input);
      CommonTokenStream Tokens = new CommonTokenStream(Lexer);
      ExpressionParser Parser = new ExpressionParser(Tokens);
      ExpressionParser.parse_return ParseReturn = Parser.parse();
      CommonTree Tree = (CommonTree)ParseReturn.Tree;
      Preorder(Tree, 0);
    }
  }
}

which produces the following output:

ROOT
  *
    +
      12.5
      /
        56
        UNARY_MIN
          7
    0.5

which corresponds to the following AST:

[Figure: diagram of the AST printed above, created using graph.gafol.net]
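One thing you can do with such a tree is walk it recursively and compute the value of the expression. The sketch below shows the idea on a hand-built copy of the tree printed above. EvalNode, AstEvaluator and AstDemo are hypothetical names, and EvalNode is only a stand-in so that the example compiles without the generated parser. With the real parser you would walk the CommonTree taken from ParseReturn.Tree instead.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical stand-in for an AST node: a label plus child nodes.
public sealed class EvalNode
{
    public readonly string Text;
    public readonly List<EvalNode> Children = new List<EvalNode>();

    public EvalNode(string text, params EvalNode[] children)
    {
        Text = text;
        Children.AddRange(children);
    }
}

public static class AstEvaluator
{
    // Recursively computes the value of the (sub)tree rooted at node.
    public static double Evaluate(EvalNode node)
    {
        switch (node.Text)
        {
            case "ROOT":
                return Evaluate(node.Children[0]);
            case "UNARY_MIN":
                return -Evaluate(node.Children[0]);
            case "+":
                return Evaluate(node.Children[0]) + Evaluate(node.Children[1]);
            case "-":
                return Evaluate(node.Children[0]) - Evaluate(node.Children[1]);
            case "*":
                return Evaluate(node.Children[0]) * Evaluate(node.Children[1]);
            case "/":
                return Evaluate(node.Children[0]) / Evaluate(node.Children[1]);
            default:
                // Anything else is treated as a Number leaf such as "12.5".
                return double.Parse(node.Text, CultureInfo.InvariantCulture);
        }
    }
}

public static class AstDemo
{
    // Same shape as the AST printed for "(12.5 + 56 / -7) * 0.5".
    public static EvalNode BuildSampleTree()
    {
        return new EvalNode("ROOT",
            new EvalNode("*",
                new EvalNode("+",
                    new EvalNode("12.5"),
                    new EvalNode("/",
                        new EvalNode("56"),
                        new EvalNode("UNARY_MIN", new EvalNode("7")))),
                new EvalNode("0.5")));
    }

    public static void Main()
    {
        double result = AstEvaluator.Evaluate(BuildSampleTree());
        Console.WriteLine(result.ToString(CultureInfo.InvariantCulture)); // prints 2.25
    }
}
```

Switching on the node text keeps the sketch short. The parentheses never show up in the tree, because the nesting of the nodes already encodes the grouping.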

Note that ANTLR 3.3 has just been released and the CSharp target is "in beta". That's why I used ANTLR 3.2 in my example.

For rather simple languages (like my example above), you could also evaluate the result on the fly without creating an AST. You can do that by embedding plain C# code inside your grammar file and letting your parser rules return a specific value.

Here's an example:

grammar Expression;

options {
  language=CSharp2;
}

@parser::namespace { Demo.Antlr }
@lexer::namespace { Demo.Antlr }

parse returns [double value]
  :  exp EOF {$value = $exp.value;}
  ;

exp returns [double value]
  :  addExp {$value = $addExp.value;}
  ;

addExp returns [double value]
  :  a=mulExp       {$value = $a.value;}
     ( '+' b=mulExp {$value += $b.value;}
     | '-' b=mulExp {$value -= $b.value;}
     )*
  ;

mulExp returns [double value]
  :  a=unaryExp       {$value = $a.value;}
     ( '*' b=unaryExp {$value *= $b.value;}
     | '/' b=unaryExp {$value /= $b.value;}
     )*
  ;

unaryExp returns [double value]
  :  '-' atom {$value = -1.0 * $atom.value;}
  |  atom     {$value = $atom.value;}
  ;

atom returns [double value]
  :  Number      {$value = Double.Parse($Number.Text, System.Globalization.CultureInfo.InvariantCulture);}
  |  '(' exp ')' {$value = $exp.value;}
  ;

Number
  :  ('0'..'9')+ ('.' ('0'..'9')+)?
  ;

Space 
  :  (' ' | '\t' | '\r' | '\n'){Skip();}
  ;

which can be tested with the class:

using System;
using Antlr.Runtime;

namespace Demo.Antlr
{
  class MainClass
  {
    public static void Main (string[] args)
    {
      string expression = "(12.5 + 56 / -7) * 0.5";
      ANTLRStringStream Input = new ANTLRStringStream(expression);  
      ExpressionLexer Lexer = new ExpressionLexer(Input);
      CommonTokenStream Tokens = new CommonTokenStream(Lexer);
      ExpressionParser Parser = new ExpressionParser(Tokens);
      Console.WriteLine(expression + " = " + Parser.parse());
    }
  }
}

and produces the following output:

(12.5 + 56 / -7) * 0.5 = 2.25
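To see why the grammar with embedded actions yields the right precedence, it can help to write the same rules by hand as a recursive-descent evaluator, with one method per rule. The following is only a sketch of that technique and not what ANTLR actually generates. DescentEvaluator and DescentDemo are made-up names, and the exp rule is folded into AddExp because it only delegates.

```csharp
using System;
using System.Globalization;

// Hand-written sketch (NOT ANTLR output): one method per grammar rule.
public sealed class DescentEvaluator
{
    private readonly string text;
    private int pos;

    private DescentEvaluator(string text) { this.text = text; }

    // parse : exp EOF
    public static double Evaluate(string expression)
    {
        DescentEvaluator p = new DescentEvaluator(expression);
        double value = p.AddExp();
        p.SkipSpace();
        if (p.pos != p.text.Length)
            throw new FormatException("Unexpected input at position " + p.pos);
        return value;
    }

    private static bool IsDigit(char c) { return c >= '0' && c <= '9'; }

    // Mirrors the Space rule: these characters are skipped between tokens.
    private void SkipSpace()
    {
        while (pos < text.Length && " \t\r\n".IndexOf(text[pos]) >= 0) pos++;
    }

    // Consumes c if it is the next non-space character.
    private bool Accept(char c)
    {
        SkipSpace();
        if (pos < text.Length && text[pos] == c) { pos++; return true; }
        return false;
    }

    // addExp : mulExp (('+' | '-') mulExp)*
    private double AddExp()
    {
        double value = MulExp();
        while (true)
        {
            if (Accept('+')) value += MulExp();
            else if (Accept('-')) value -= MulExp();
            else return value;
        }
    }

    // mulExp : unaryExp (('*' | '/') unaryExp)*
    private double MulExp()
    {
        double value = UnaryExp();
        while (true)
        {
            if (Accept('*')) value *= UnaryExp();
            else if (Accept('/')) value /= UnaryExp();
            else return value;
        }
    }

    // unaryExp : '-' atom | atom
    private double UnaryExp()
    {
        return Accept('-') ? -Atom() : Atom();
    }

    // atom : Number | '(' exp ')'
    private double Atom()
    {
        if (!Accept('(')) return Number();
        double value = AddExp();
        if (!Accept(')')) throw new FormatException("Expected ')' at position " + pos);
        return value;
    }

    // Number : ('0'..'9')+ ('.' ('0'..'9')+)?
    private double Number()
    {
        SkipSpace();
        int start = pos;
        while (pos < text.Length && IsDigit(text[pos])) pos++;
        if (pos == start) throw new FormatException("Expected a number at position " + pos);
        if (pos + 1 < text.Length && text[pos] == '.' && IsDigit(text[pos + 1]))
        {
            pos++; // consume '.'
            while (pos < text.Length && IsDigit(text[pos])) pos++;
        }
        return double.Parse(text.Substring(start, pos - start), CultureInfo.InvariantCulture);
    }
}

public static class DescentDemo
{
    public static void Main()
    {
        string expression = "(12.5 + 56 / -7) * 0.5";
        double result = DescentEvaluator.Evaluate(expression);
        // prints (12.5 + 56 / -7) * 0.5 = 2.25
        Console.WriteLine(expression + " = " + result.ToString(CultureInfo.InvariantCulture));
    }
}
```

Because AddExp calls MulExp for its operands, and MulExp calls UnaryExp, multiplication and division bind tighter than addition and subtraction. The layering of the rules in the grammar does the same job.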

EDIT

In the comments, Ralph wrote:

Tip for those using Visual Studio: you can put something like java -cp "$(ProjectDir)antlr-3.2.jar" org.antlr.Tool "$(ProjectDir)Expression.g" in the pre-build events, then you can just modify your grammar and run the project without having to worry about rebuilding the lexer/parser.
