ANTLR: From CommonTree to useful object graph
Question
I started using ANTLR today and I've created a basic parser.
After parsing I end up with a tree. To me it seems like this is just a bunch of Strings put together in a tree structure of Tree nodes. That's not very useful to me. I'd like to have a graph of objects.
To clarify (this is an example, and not my real application): for "5-1+6" I seem to end up with:
new String("PLUS")
    new String("MINUS")
        new String("5")
        new String("1")
    new String("6")
What I would find more useful:
new Plus(
    new Minus(
        new IntegerLiteral(5),
        new IntegerLiteral(1)),
    new IntegerLiteral(6))
What is the most convenient way of going from the first representation to the other? In this article the author does something similar to this:
public Expression createExpr(CommonTree ast) {
    // ...
    switch (ast.getType()) {
        case SimpleExpressionParser.INT:
            return new IntegerLiteral(ast.getText());
        case SimpleExpressionParser.PLUS:
            return new Plus(createExpr((CommonTree)ast.getChild(0)),   // recurse
                            createExpr((CommonTree)ast.getChild(1)));  // recurse
        case SimpleExpressionParser.MINUS:
            return new Minus(createExpr((CommonTree)ast.getChild(0)),  // recurse
                             createExpr((CommonTree)ast.getChild(1))); // recurse
    }
    // ...
}
Is this the preferred way?! Can't I instruct ANTLR to generate this boilerplate code somehow (it will be huge)?
Possibly related question:
- Converting Antlr syntax tree into useful objects (but I can't see how the answer there answers my question.)
Recommended answer
Here's a possible way. In short, these are the steps you'd perform:
1. create a combined grammar that will generate the lexer and parser;
2. mix AST rewrite rules into the grammar from (1) to transform the flat list of tokens into a proper tree;
3. write a tree grammar that can walk the tree from (2);
4. mix custom code inside your tree walker;
5. test it.
1
Let's create a small expression parser supporting +, -, *, /, (...) and numbers, which could look like:
grammar Exp; // file: Exp.g

eval
  :  exp EOF
  ;

exp
  :  addExp
  ;

addExp
  :  mulExp ((Add | Sub) mulExp)*
  ;

mulExp
  :  unaryExp ((Mul | Div) unaryExp)*
  ;

unaryExp
  :  Sub atom
  |  atom
  ;

atom
  :  Number
  |  '(' exp ')'
  ;

Add    : '+';
Sub    : '-';
Mul    : '*';
Div    : '/';
Number : '0'..'9'+;
Space  : ' ' {skip();};
2
Including rewrite rules, it will look like:
grammar Exp; // file: Exp.g

options {
  output=AST;
}

tokens {
  U_SUB;
}

eval
  :  exp EOF -> exp
  ;

exp
  :  addExp
  ;

addExp
  :  mulExp ((Add | Sub)^ mulExp)*
  ;

mulExp
  :  unaryExp ((Mul | Div)^ unaryExp)*
  ;

unaryExp
  :  Sub atom -> ^(U_SUB atom)
  |  atom
  ;

atom
  :  Number
  |  '(' exp ')' -> exp
  ;

Add    : '+';
Sub    : '-';
Mul    : '*';
Div    : '/';
Number : '0'..'9'+;
Space  : ' ' {skip();};
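Now an expression like 10 - 2 * (3 + 8) will be transformed to a tree in which every operator is the root of its operands: - sits at the top with 10 and the * subtree as its children, and the parentheses are gone.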
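To make that tree shape concrete, here is a minimal sketch that builds the same structure by hand and prints it in a LISP-like notation. SketchNode and AstShapeSketch are hypothetical names used only for illustration; they are not ANTLR classes, and the nesting shown is what the ^ operators in the grammar above should produce for this input.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an AST node; this is NOT ANTLR's CommonTree.
class SketchNode {
    final String text;
    final List<SketchNode> children = new ArrayList<>();

    SketchNode(String text, SketchNode... kids) {
        this.text = text;
        for (SketchNode k : kids) {
            children.add(k);
        }
    }

    // Leaves print as their text; inner nodes as "(root child child ...)".
    String toStringTree() {
        if (children.isEmpty()) {
            return text;
        }
        StringBuilder sb = new StringBuilder("(").append(text);
        for (SketchNode c : children) {
            sb.append(' ').append(c.toStringTree());
        }
        return sb.append(')').toString();
    }
}

class AstShapeSketch {
    // Hand-built tree for 10 - 2 * (3 + 8): operators are roots, operands are children.
    static SketchNode build() {
        return new SketchNode("-",
                new SketchNode("10"),
                new SketchNode("*",
                        new SketchNode("2"),
                        new SketchNode("+", new SketchNode("3"), new SketchNode("8"))));
    }

    public static void main(String[] args) {
        System.out.println(build().toStringTree()); // (- 10 (* 2 (+ 3 8)))
    }
}
```

The parentheses from the input no longer appear as nodes; the grouping they expressed is carried by the nesting alone.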
3
To create a tree grammar that generates a walker for the AST generated in (2), you'd do something like this:
tree grammar ExpWalker; // file: ExpWalker.g

options {
  tokenVocab=Exp; // use the tokens from Exp.g
  ASTLabelType=CommonTree;
}

eval
  :  exp
  ;

exp
  :  ^(Add exp exp)
  |  ^(Sub exp exp)
  |  ^(Mul exp exp)
  |  ^(Div exp exp)
  |  ^(U_SUB exp)
  |  Number
  ;
4
And to mix your custom classes into this tree walker, do something like this:
tree grammar ExpWalker; // file: ExpWalker.g

options {
  tokenVocab=Exp; // use the tokens from Exp.g
  ASTLabelType=CommonTree;
}

eval returns [ExpNode e]
  :  exp {e = $exp.e;}
  ;

exp returns [ExpNode e]
  :  ^(Add a=exp b=exp) {e = new AddExp($a.e, $b.e);}
  |  ^(Sub a=exp b=exp) {e = new SubExp($a.e, $b.e);}
  |  ^(Mul a=exp b=exp) {e = new MulExp($a.e, $b.e);}
  |  ^(Div a=exp b=exp) {e = new DivExp($a.e, $b.e);}
  |  ^(U_SUB a=exp)     {e = new UnaryExp($a.e);}
  |  Number             {e = new NumberExp($Number.text);}
  ;
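Roughly speaking, the walker generated from such a tree grammar does the same job as a hand-written recursion with one case per alternative of the exp rule. The sketch below shows that idea without any ANTLR dependency. AstNode, BinaryExp, NegExp and HandWalker are hypothetical names introduced for this illustration, and the object graph is condensed to one class per node shape rather than one class per operator.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a parsed AST node (not ANTLR's CommonTree).
class AstNode {
    final String type;            // "Add", "Sub", "Mul", "Div", "U_SUB" or "Number"
    final String text;            // only used for Number nodes
    final List<AstNode> children;

    AstNode(String type, String text, AstNode... children) {
        this.type = type;
        this.text = text;
        this.children = Arrays.asList(children);
    }
}

// Condensed object graph for the sketch.
interface ExpNode {
    double evaluate();
}

class NumberExp implements ExpNode {
    final double num;
    NumberExp(String s) { num = Double.parseDouble(s); }
    public double evaluate() { return num; }
}

class BinaryExp implements ExpNode {
    final char op;
    final ExpNode left, right;
    BinaryExp(char op, ExpNode left, ExpNode right) {
        this.op = op;
        this.left = left;
        this.right = right;
    }
    public double evaluate() {
        double l = left.evaluate(), r = right.evaluate();
        switch (op) {
            case '+': return l + r;
            case '-': return l - r;
            case '*': return l * r;
            default:  return l / r; // '/'
        }
    }
}

class NegExp implements ExpNode {
    final ExpNode exp;
    NegExp(ExpNode exp) { this.exp = exp; }
    public double evaluate() { return -exp.evaluate(); }
}

class HandWalker {
    // One case per alternative of the tree grammar's exp rule.
    static ExpNode walk(AstNode n) {
        switch (n.type) {
            case "Add":    return new BinaryExp('+', walk(n.children.get(0)), walk(n.children.get(1)));
            case "Sub":    return new BinaryExp('-', walk(n.children.get(0)), walk(n.children.get(1)));
            case "Mul":    return new BinaryExp('*', walk(n.children.get(0)), walk(n.children.get(1)));
            case "Div":    return new BinaryExp('/', walk(n.children.get(0)), walk(n.children.get(1)));
            case "U_SUB":  return new NegExp(walk(n.children.get(0)));
            case "Number": return new NumberExp(n.text);
            default: throw new IllegalArgumentException("unexpected node: " + n.type);
        }
    }

    static AstNode num(String s) { return new AstNode("Number", s); }

    public static void main(String[] args) {
        // Hand-built AST for 10 - 2 * (3 + 8)
        AstNode ast = new AstNode("Sub", "-", num("10"),
                new AstNode("Mul", "*", num("2"),
                        new AstNode("Add", "+", num("3"), num("8"))));
        System.out.println(walk(ast).evaluate()); // -12.0
    }
}
```

This is the kind of switch the question wanted to avoid writing by hand; the tree grammar states the same cases declaratively and leaves the recursion to the generated code.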
5
Here's some code to test all the classes (put everything in a single file, Main.java):
import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;

public class Main {
    public static void main(String[] args) throws Exception {
        String source = "10 - 2 * (3 + 8)";
        // lex and parse the source into an AST
        ExpLexer lexer = new ExpLexer(new ANTLRStringStream(source));
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        ExpParser parser = new ExpParser(tokens);
        ExpParser.eval_return returnValue = parser.eval();
        CommonTree tree = (CommonTree)returnValue.getTree();
        // walk the AST to build the ExpNode object graph
        CommonTreeNodeStream nodes = new CommonTreeNodeStream(tree);
        ExpWalker walker = new ExpWalker(nodes);
        ExpNode root = walker.eval();
        System.out.println(source + " = " + root.evaluate());
    }
}

interface ExpNode {
    double evaluate();
}

class NumberExp implements ExpNode {
    final double num;

    NumberExp(String s) {
        num = Double.parseDouble(s);
    }

    @Override
    public double evaluate() {
        return num;
    }
}

class AddExp implements ExpNode {
    final ExpNode left, right;

    AddExp(ExpNode a, ExpNode b) {
        left = a;
        right = b;
    }

    @Override
    public double evaluate() {
        return left.evaluate() + right.evaluate();
    }
}

class SubExp implements ExpNode {
    final ExpNode left, right;

    SubExp(ExpNode a, ExpNode b) {
        left = a;
        right = b;
    }

    @Override
    public double evaluate() {
        return left.evaluate() - right.evaluate();
    }
}

class MulExp implements ExpNode {
    final ExpNode left, right;

    MulExp(ExpNode a, ExpNode b) {
        left = a;
        right = b;
    }

    @Override
    public double evaluate() {
        return left.evaluate() * right.evaluate();
    }
}

class DivExp implements ExpNode {
    final ExpNode left, right;

    DivExp(ExpNode a, ExpNode b) {
        left = a;
        right = b;
    }

    @Override
    public double evaluate() {
        return left.evaluate() / right.evaluate();
    }
}

class UnaryExp implements ExpNode {
    final ExpNode exp;

    UnaryExp(ExpNode e) {
        exp = e;
    }

    @Override
    public double evaluate() {
        return -exp.evaluate();
    }
}
and then run:
# generate a lexer & parser
java -cp antlr-3.2.jar org.antlr.Tool Exp.g
# generate the tree walker
java -cp antlr-3.2.jar org.antlr.Tool ExpWalker.g
# compile everything
javac -cp antlr-3.2.jar *.java
# run the main class
java -cp .:antlr-3.2.jar Main # *nix
java -cp .;antlr-3.2.jar Main # Windows
which prints:
10 - 2 * (3 + 8) = -12.0
You could skip the tree walker and mix all the code and returns [...] inside your combined grammar, but IMO, a tree grammar keeps things more orderly because the lexer rules, and tokens like ( and ) etc. are removed from it.
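For comparison, here is a rough hand-written analogue of that single-grammar approach: each parser rule does its work and returns a result directly, with no intermediate tree. It is only a sketch of the idea, not what ANTLR generates. DirectExpParser and its method names are hypothetical, and the rules return a double rather than an ExpNode purely to keep the example short.

```java
// Hand-written recursive descent mirroring the rules of Exp.g, with the
// "actions" (here: evaluation) mixed directly into the parsing methods.
class DirectExpParser {
    private final String src;
    private int pos = 0;

    DirectExpParser(String src) {
        this.src = src;
    }

    // eval : exp EOF
    static double eval(String src) {
        DirectExpParser p = new DirectExpParser(src);
        double v = p.addExp();
        p.skipSpaces();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("unexpected input at " + p.pos);
        }
        return v;
    }

    // addExp : mulExp ((Add | Sub) mulExp)*
    private double addExp() {
        double v = mulExp();
        while (true) {
            if (accept('+')) {
                v += mulExp();
            } else if (accept('-')) {
                v -= mulExp();
            } else {
                return v;
            }
        }
    }

    // mulExp : unaryExp ((Mul | Div) unaryExp)*
    private double mulExp() {
        double v = unaryExp();
        while (true) {
            if (accept('*')) {
                v *= unaryExp();
            } else if (accept('/')) {
                v /= unaryExp();
            } else {
                return v;
            }
        }
    }

    // unaryExp : Sub atom | atom
    private double unaryExp() {
        return accept('-') ? -atom() : atom();
    }

    // atom : Number | '(' exp ')'
    private double atom() {
        if (accept('(')) {
            double v = addExp();
            if (!accept(')')) {
                throw new IllegalArgumentException("expected ')' at " + pos);
            }
            return v;
        }
        skipSpaces();
        int start = pos;
        while (pos < src.length() && src.charAt(pos) >= '0' && src.charAt(pos) <= '9') {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("expected number at " + pos);
        }
        return Double.parseDouble(src.substring(start, pos));
    }

    private void skipSpaces() {
        while (pos < src.length() && src.charAt(pos) == ' ') {
            pos++;
        }
    }

    // Skips spaces, then consumes c if it is the next character.
    private boolean accept(char c) {
        skipSpaces();
        if (pos < src.length() && src.charAt(pos) == c) {
            pos++;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(eval("10 - 2 * (3 + 8)")); // -12.0
    }
}
```

Note how the handling of parentheses and whitespace sits right next to the evaluation logic here; that mixing is what a separate tree grammar avoids.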
HTH