ANTLR:从CommonTree到有用的对象图 [英] ANTLR: From CommonTree to useful object graph

查看:83
本文介绍了ANTLR:从CommonTree到有用的对象图的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我今天开始使用ANTLR并且我已经创建了一个基本的解析器。

I started using ANTLR today and I've created a basic parser.

解析后我最终得到了一棵树。对我来说,似乎这只是一堆 String 放在 Tree -nodes的树结构中。这对我来说不是很有用。我想要一个对象图。

After parsing I end up with a tree. To me it seems like this is just a bunch of Strings put together in a tree structure of Tree-nodes. That's not very useful to me. I'd like to have a graph of objects.

澄清(这是一个例子,而不是我真正的应用程序):对于5 -1 + 6我似乎最终得到:

To clarify (this is an example, and not my real application): For "5-1+6" I seem to end up with:

new String("PLUS")
    new String("MINUS")
        new String("5")
        new String("1")
    new String("6")

我觉得更有用:

new Plus(
    new Minus(
        new IntegerLiteral(5),
        new IntegerLiteral(1)),
    new IntegerLiteral(6))

从第一个表示到另一个表示最方便的方法是什么?在这篇文章中作者做类似的事情:

What is the most convenient way of going from the first representation to the other? In this article the author does something similar to this:

public Expression createExpr(CommonTree ast) {

    // ...

    switch (ast.getType()) {
    case SimpleExpressionParser.INT:
        return new IntegerLiteral(ast.getText())
    case SimpleExpressionParser.PLUS:
        return new Plus(createExpr((CommonTree)ast.getChild(0)),    // recurse
                        createExpr((CommonTree)ast.getChild(1)));   // recurse
    case SimpleExpressionParser.MINUS:
        return new Minus(createExpr((CommonTree)ast.getChild(0)),   // recurse
                         createExpr((CommonTree)ast.getChild(1)));  // recurse
    }

    // ...
}

这是首选方式吗?!我不能指示ANTLR以某种方式生成这个样板代码(它会很大)吗?

Is this the preferred way?! Can't I instruct ANTLR to generate this boiler-plate code somehow (it will be huge)?

可能相关的问题:

  • Converting Antlr syntax tree into useful objects (But I can't see how the answer answers my question.)

推荐答案

这是一种可能的方法。简而言之,这些是您要执行的步骤:

Here's a a possible way. In short, these are the step you'd perform:


  1. 创建将生成词法分析器和解析器的组合语法;

  2. 在(1)语法中混合AST重写规则,将标记的平面列表转换为适当的树;

  3. 编写可以遍历树的语法树从(2);

  4. 在树步行者中混合自定义代码;

  5. 测试它。

  1. create a combined grammar that will generate the lexer and parser;
  2. mix AST rewrite rules in the grammar from (1) to transform the flat list of tokens into a proper tree;
  3. write a tree grammar that can walk the tree from (2);
  4. mix custom code inside your tree walker;
  5. test it.



1



让我们创建一个支持 + 的小型表达式解析器, - * / (...)和数字,可能如下所示:

1

Let's create a small expression parser supporting +, -, *, /, (...) and numbers, which could look like:

grammar Exp; // file: Exp.g

eval
  :  exp EOF
  ;

exp 
  :  addExp 
  ;

addExp
  :  mulExp ((Add | Sub) mulExp)*
  ;

mulExp
  :  unaryExp ((Mul | Div) unaryExp)*
  ;

unaryExp
  :  Sub atom
  |  atom
  ;

atom
  :  Number
  |  '(' exp ')'
  ;

Add    : '+';
Sub    : '-';
Mul    : '*';
Div    : '/';
Number : '0'..'9'+;
Space  : ' ' {skip();};



2



包括重写规则,它将如下所示:

2

Including rewrite rules, it will look like:

grammar Exp; // file: Exp.g

options {
  output=AST;
}

tokens {
  U_SUB;
}

eval 
  :  exp EOF -> exp
  ;

exp 
  :  addExp 
  ;

addExp
  :  mulExp ((Add | Sub)^ mulExp)*
  ;

mulExp
  :  unaryExp ((Mul | Div)^ unaryExp)*
  ;

unaryExp
  :  Sub atom -> ^(U_SUB atom)
  |  atom
  ;

atom
  :  Number
  |  '(' exp ')' -> exp
  ;

Add    : '+';
Sub    : '-';
Mul    : '*';
Div    : '/';
Number : '0'..'9'+;
Space  : ' ' {skip();};

现在表达式如 10 - 2 *(3 + 8)将转换为:

Now an expression like 10 - 2 * (3 + 8) will be transformed to:

创建树语法,为生成的AST生成迭代器(2),你会做这样的事情:

To create a tree grammar that generates an iterator for the AST generated in (2), you'd do something like this:

tree grammar ExpWalker; // file: ExpWalker.g

options {
  tokenVocab=Exp; // use the tokens from Exp.g
  ASTLabelType=CommonTree;
}

eval
  :  exp
  ;

exp
  :  ^(Add exp exp)
  |  ^(Sub exp exp)
  |  ^(Mul exp exp)
  |  ^(Div exp exp)
  |  ^(U_SUB exp)
  |  Number
  ;



4



并混合你的定制在这个树迭代器中的类,做这样的事情:

4

And to mix your custom classes in this tree iterator, do something like this:

tree grammar ExpWalker; // file: ExpWalker.g

options {
  tokenVocab=Exp; // use the tokens from Exp.g
  ASTLabelType=CommonTree;
}

eval returns [ExpNode e]
  :  exp {e = $exp.e;}
  ;

exp returns [ExpNode e]
  :  ^(Add a=exp b=exp) {e = new AddExp($a.e, $b.e);}
  |  ^(Sub a=exp b=exp) {e = new SubExp($a.e, $b.e);}
  |  ^(Mul a=exp b=exp) {e = new MulExp($a.e, $b.e);}
  |  ^(Div a=exp b=exp) {e = new DivExp($a.e, $b.e);}
  |  ^(U_SUB a=exp)     {e = new UnaryExp($a.e);}
  |  Number             {e = new NumberExp($Number.text);}
  ;



5



这里有一些要测试的代码所有类(仅限于一个文件: Main.java ):

import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import org.antlr.stringtemplate.*;

public class Main {
  public static void main(String[] args) throws Exception {
    String source = "10 - 2 * (3 + 8)";
    ExpLexer lexer = new ExpLexer(new ANTLRStringStream(source));
    CommonTokenStream tokens = new CommonTokenStream(lexer);
    ExpParser parser = new ExpParser(tokens);
    ExpParser.eval_return returnValue = parser.eval();
    CommonTree tree = (CommonTree)returnValue.getTree();
    CommonTreeNodeStream nodes = new CommonTreeNodeStream(tree);
    ExpWalker walker = new ExpWalker(nodes);
    ExpNode root = walker.eval();
    System.out.println(source + " = " + root.evaluate());
  }
}

interface ExpNode {
  double evaluate();
}

class NumberExp implements ExpNode {

  final double num;

  NumberExp(String s) {
    num = Double.parseDouble(s);
  }

  @Override
  public double evaluate() {
    return num;
  }
}

class AddExp implements ExpNode {

  final ExpNode left, right;

  AddExp(ExpNode a, ExpNode b) {
    left = a;
    right = b;
  }

  @Override
  public double evaluate() {
    return left.evaluate() + right.evaluate();
  }
}

class SubExp implements ExpNode {

  final ExpNode left, right;

  SubExp(ExpNode a, ExpNode b) {
    left = a;
    right = b;
  }

  @Override
  public double evaluate() {
    return left.evaluate() - right.evaluate();
  }
}

class MulExp implements ExpNode {

  final ExpNode left, right;

  MulExp(ExpNode a, ExpNode b) {
    left = a;
    right = b;
  }

  @Override
  public double evaluate() {
    return left.evaluate() * right.evaluate();
  }
}

class DivExp implements ExpNode {

  final ExpNode left, right;

  DivExp(ExpNode a, ExpNode b) {
    left = a;
    right = b;
  }

  @Override
  public double evaluate() {
    return left.evaluate() / right.evaluate();
  }
}

class UnaryExp implements ExpNode {

  final ExpNode exp;

  UnaryExp(ExpNode e) {
    exp = e;
  }

  @Override
  public double evaluate() {
    return -exp.evaluate();
  }
}

然后执行:


# generate a lexer & parser
java -cp antlr-3.2.jar org.antlr.Tool Exp.g

# generate the tree walker
java -cp antlr-3.2.jar org.antlr.Tool ExpWalker.g

# compile everything
javac -cp antlr-3.2.jar *.java

# run the main class
java -cp .:antlr-3.2.jar Main         # *nix 
java -cp .;antlr-3.2.jar Main         # Windows

打印:


10 - 2 * (3 + 8) = -12.0

您可以跳过树行走者并混合所有代码并返回 ...] 在你的组合语法中,但IMO,一个树语法使事情更有序,因为词法分析器规则,以及像等已从中移除。

You could skip the tree-walker and mix all the code and returns [...] inside your combined grammar, but IMO, a tree grammar keeps things more orderly because the lexer rules, and tokens like ( and ) etc. are removed from it.

HTH

这篇关于ANTLR:从CommonTree到有用的对象图的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆