如何使用 ANTLR4 创建 AST? [英] How to create AST with ANTLR4?

查看:23
本文介绍了如何使用 ANTLR4 创建 AST?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我一直在搜索有关此内容的大量内容,但找不到任何真正可以帮助我构建 AST 的有用信息.我已经知道 ANTLR4 不像以前的 ANTLR3 那样构建 AST.每个人都说:嘿,使用访问者!",但我找不到任何示例或更详细的说明来说明如何执行此操作...

I've been searching A LOT about this and I couldn't find anything useful that REALLY helps me build an AST. I already know that ANTLR4 doesn't build AST like ANTLR3 used to do. Everyone say: "Hey, use visitors!", but I couldn't find any example or more detailed explanation on HOW can I do this...

我的语法必须像 C,但每个命令都是用葡萄牙语(葡萄牙语编程语言)编写的.我可以使用 ANTLR4 轻松生成解析树.我的问题是:我现在需要做什么来创建 AST?

I have a grammar must like C, but with every commands written in Portuguese (portuga programming language). I can easily generate the parse tree using ANTLR4. My question is: What I need to do now to create an AST?

顺便说一句,我正在使用 Java 和 IntelliJ...

BTW, I'm using Java and IntelliJ...

我能得到的最接近的是使用这个主题的答案:是否有使用antlr4从java源代码创建AST并提取方法、变量的简单示例和评论?但它只打印被访问方法的名称..

The closest I could get was using the answer of this topic: Is there a simple example of using antlr4 to create an AST from java source code and extract methods, variables and comments? But it only prints the name of the visited methods..

由于第一次尝试对我没有预期的效果,我尝试使用 本教程来自 ANTLR3,但我不知道如何使用 StringTamplate 而不是 ST...

Since the first attempt didn't work for me as I expected, I tried to use this tutorial from ANTLR3, but I couldn't figure out how to use StringTamplate instead of ST...

阅读这本书 最终的 ANTLR 4 参考我也找不到与 AST 相关的任何内容.

Reading the book The Definitive ANTLR 4 Reference I also couldn't find anything related to ASTs.

现在我有一个类来创建 DOT 文件,我只需要弄清楚如何正确使用访问者

Now I have one class to create the DOT file, I just need figure out on how to use visitors properly

推荐答案

好的,让我们构建一个简单的数学示例.构建 AST 对于这样的任务来说完全是矫枉过正,但它是展示原理的好方法.

Ok, let's build a simple math example. Building an AST is totally overkill for such a task but it's a nice way to show the principle.

我会用 C# 来做,但 Java 版本会非常相似.

I'll do it in C# but the Java version would be very similar.

首先,让我们编写一个非常基本的数学语法来使用:

First, let's write a very basic math grammar to work with:

grammar Math;

compileUnit
    :   expr EOF
    ;

expr
    :   '(' expr ')'                         # parensExpr
    |   op=('+'|'-') expr                    # unaryExpr
    |   left=expr op=('*'|'/') right=expr    # infixExpr
    |   left=expr op=('+'|'-') right=expr    # infixExpr
    |   func=ID '(' expr ')'                 # funcExpr
    |   value=NUM                            # numberExpr
    ;

OP_ADD: '+';
OP_SUB: '-';
OP_MUL: '*';
OP_DIV: '/';

NUM :   [0-9]+ ('.' [0-9]+)? ([eE] [+-]? [0-9]+)?;
ID  :   [a-zA-Z]+;
WS  :   [ 	
] -> channel(HIDDEN);

非常基本的东西,我们有一个 expr 规则来处理所有事情(优先规则等).

Pretty basic stuff, we have a single expr rule that handles everything (precedence rules etc).

然后,让我们定义一些我们将使用的 AST 节点.这些是完全自定义的,您可以按照自己的方式定义它们.

Then, let's define some AST nodes we'll use. These are totally custom and you can define them in the way you want to.

以下是我们将在此示例中使用的节点:

Here are the nodes we'll be using for this example:

internal abstract class ExpressionNode
{
}

internal abstract class InfixExpressionNode : ExpressionNode
{
    public ExpressionNode Left { get; set; }
    public ExpressionNode Right { get; set; }
}

internal class AdditionNode : InfixExpressionNode
{
}

internal class SubtractionNode : InfixExpressionNode
{
}

internal class MultiplicationNode : InfixExpressionNode
{
}

internal class DivisionNode : InfixExpressionNode
{
}

internal class NegateNode : ExpressionNode
{
    public ExpressionNode InnerNode { get; set; }
}

internal class FunctionNode : ExpressionNode
{
    public Func<double, double> Function { get; set; }
    public ExpressionNode Argument { get; set; }
}

internal class NumberNode : ExpressionNode
{
    public double Value { get; set; }
}

将 CST 转换为 AST

ANTLR 为我们生成了 CST 节点(MathParser.*Context 类).我们现在必须将这些转换为 AST 节点.

Converting a CST to an AST

ANTLR generated the CST nodes for us (the MathParser.*Context classes). We now have to convert these to AST nodes.

这对访问者来说很容易完成,ANTLR 为我们提供了一个 MathBaseVisitor 类,让我们来处理它.

This is easily done with a visitor, and ANTLR provides us with a MathBaseVisitor<T> class, so let's work with that.

internal class BuildAstVisitor : MathBaseVisitor<ExpressionNode>
{
    public override ExpressionNode VisitCompileUnit(MathParser.CompileUnitContext context)
    {
        return Visit(context.expr());
    }

    public override ExpressionNode VisitNumberExpr(MathParser.NumberExprContext context)
    {
        return new NumberNode
        {
            Value = double.Parse(context.value.Text, NumberStyles.AllowDecimalPoint | NumberStyles.AllowExponent)
        };
    }

    public override ExpressionNode VisitParensExpr(MathParser.ParensExprContext context)
    {
        return Visit(context.expr());
    }

    public override ExpressionNode VisitInfixExpr(MathParser.InfixExprContext context)
    {
        InfixExpressionNode node;

        switch (context.op.Type)
        {
            case MathLexer.OP_ADD:
                node = new AdditionNode();
                break;

            case MathLexer.OP_SUB:
                node = new SubtractionNode();
                break;

            case MathLexer.OP_MUL:
                node = new MultiplicationNode();
                break;

            case MathLexer.OP_DIV:
                node = new DivisionNode();
                break;

            default:
                throw new NotSupportedException();
        }

        node.Left = Visit(context.left);
        node.Right = Visit(context.right);

        return node;
    }

    public override ExpressionNode VisitUnaryExpr(MathParser.UnaryExprContext context)
    {
        switch (context.op.Type)
        {
            case MathLexer.OP_ADD:
                return Visit(context.expr());

            case MathLexer.OP_SUB:
                return new NegateNode
                {
                    InnerNode = Visit(context.expr())
                };

            default:
                throw new NotSupportedException();
        }
    }

    public override ExpressionNode VisitFuncExpr(MathParser.FuncExprContext context)
    {
        var functionName = context.func.Text;

        var func = typeof(Math)
            .GetMethods(BindingFlags.Public | BindingFlags.Static)
            .Where(m => m.ReturnType == typeof(double))
            .Where(m => m.GetParameters().Select(p => p.ParameterType).SequenceEqual(new[] { typeof(double) }))
            .FirstOrDefault(m => m.Name.Equals(functionName, StringComparison.OrdinalIgnoreCase));

        if (func == null)
            throw new NotSupportedException(string.Format("Function {0} is not supported", functionName));

        return new FunctionNode
        {
            Function = (Func<double, double>)func.CreateDelegate(typeof(Func<double, double>)),
            Argument = Visit(context.expr())
        };
    }
}

如您所见,这只是通过使用访问者从 CST 节点创建 AST 节点的问题.代码应该是不言自明的(好吧,也许除了 VisitFuncExpr 的东西,但这只是将委托连接到 System.Math 类).

As you can see, it's just a matter of creating an AST node out of a CST node by using a visitor. The code should be pretty self-explanatory (well, maybe except for the VisitFuncExpr stuff, but it's just a quick way to wire up a delegate to a suitable method of the System.Math class).

这里有 AST 构建的东西.这就是所需要的.只需从 CST 中提取相关信息并将其保存在 AST 中即可.

And here you have the AST building stuff. That's all that's needed. Just extract the relevant information from the CST and keep it in the AST.

现在,让我们玩一下 AST.我们必须构建一个 AST 访问者基类来遍历它.让我们做一些类似于ANTLR提供的AbstractParseTreeVisitor.

Now, let's play a bit with the AST. We'll have to build an AST visitor base class to traverse it. Let's just do something similar to the AbstractParseTreeVisitor<T> provided by ANTLR.

internal abstract class AstVisitor<T>
{
    public abstract T Visit(AdditionNode node);
    public abstract T Visit(SubtractionNode node);
    public abstract T Visit(MultiplicationNode node);
    public abstract T Visit(DivisionNode node);
    public abstract T Visit(NegateNode node);
    public abstract T Visit(FunctionNode node);
    public abstract T Visit(NumberNode node);

    public T Visit(ExpressionNode node)
    {
        return Visit((dynamic)node);
    }
}

在这里,我利用 C# 的 dynamic 关键字在一行代码中执行了双重调度.在 Java 中,您必须自己使用一系列 if 语句进行接线,如下所示:

Here, I took advantage of C#'s dynamic keyword to perform a double-dispatch in one line of code. In Java, you'll have to do the wiring yourself with a sequence of if statements like these:

if (node is AdditionNode) {
    return Visit((AdditionNode)node);
} else if (node is SubtractionNode) {
    return Visit((SubtractionNode)node);
} else if ...

但我只是选择了这个例子的快捷方式.

But I just went for the shortcut for this example.

那么,我们可以用数学表达式树做什么?当然要评价!让我们实现一个表达式评估器:

So, what can we do with a math expression tree? Evaluate it, of course! Let's implement an expression evaluator:

internal class EvaluateExpressionVisitor : AstVisitor<double>
{
    public override double Visit(AdditionNode node)
    {
        return Visit(node.Left) + Visit(node.Right);
    }

    public override double Visit(SubtractionNode node)
    {
        return Visit(node.Left) - Visit(node.Right);
    }

    public override double Visit(MultiplicationNode node)
    {
        return Visit(node.Left) * Visit(node.Right);
    }

    public override double Visit(DivisionNode node)
    {
        return Visit(node.Left) / Visit(node.Right);
    }

    public override double Visit(NegateNode node)
    {
        return -Visit(node.InnerNode);
    }

    public override double Visit(FunctionNode node)
    {
        return node.Function(Visit(node.Argument));
    }

    public override double Visit(NumberNode node)
    {
        return node.Value;
    }
}

一旦我们有了 AST 就很简单了,不是吗?

Pretty simple once we have an AST, isn't it?

最后但并非最不重要的是,我们必须实际编写主程序:

Last but not least, we have to actually write the main program:

internal class Program
{
    private static void Main()
    {
        while (true)
        {
            Console.Write("> ");
            var exprText = Console.ReadLine();

            if (string.IsNullOrWhiteSpace(exprText))
                break;

            var inputStream = new AntlrInputStream(new StringReader(exprText));
            var lexer = new MathLexer(inputStream);
            var tokenStream = new CommonTokenStream(lexer);
            var parser = new MathParser(tokenStream);

            try
            {
                var cst = parser.compileUnit();
                var ast = new BuildAstVisitor().VisitCompileUnit(cst);
                var value = new EvaluateExpressionVisitor().Visit(ast);

                Console.WriteLine("= {0}", value);
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }

            Console.WriteLine();
        }
    }
}

现在我们终于可以玩了:

And now we can finally play with it:

这篇关于如何使用 ANTLR4 创建 AST?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆