如何使用分析树? [英] How are parse trees used?

查看:217
本文介绍了如何使用分析树?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我目前正在学习解析器。我一直在看视频,并试图写代码,但我开始很难理解。我认为可能会理解一个解析器的动机可以帮助我们理解它们的工作原理以及它们应该如何构建。

I'm currently learning about parsers. I've been watching videos and trying to write code, but I'm starting to have a hard time understanding. I thought maybe understand the motivation for a parser could help to understand how they work and how they should be built.

解析器将获取一串令牌并创建一个解析树。我可以理解一个解析树是什么,但我只是不知道如何使用。最终,编译器使用解析树来创建机器代码,但它究竟是如何做到这一点?可以有人给我一个例子吗?

So the goal of a parser is to take a string of tokens and create a parse tree. I can understand what a parse tree is, but I just don't see how it's used. Ultimately, a compiler uses a parse tree to create machine code, but exactly how does it do this? Can somebody show me an example?

还有什么是解析(和解析树)用于?

What else is parsing (and parse trees) used for?

推荐答案

想象你想为数学表达式制作一种语言。用户可以输入

Imagine you want to make a language for math expressions. The user might input

(3 + 4)* 36

编译器会为该输入创建一个类似于

The compiler would create a parse tree for that input that looks like

      * 
     / \
    +   36
   / \
  3   4

可以生成机器代码来评估这个数学表达式,通过递归遍历树,发出孩子的指令,然后自己。也称为后序遍历。在这种情况下发出的指令的顺序是:

A simple compiler could generate machine code to evaluate this math expression by recursively walking through the tree, emitting the children's instructions, then its own. Aka a post-order traversal. The order of instructions emitted in this case would be:


  1. 将数字3放在堆栈上的说明

  2. 将数字4放在堆栈上的说明

  3. 将堆栈顶端的2个元素弹出,添加并将结果放入堆栈的说明

  4. 将数字36放在堆栈上的说明

  5. 弹出堆栈顶端2个元素,将它们相乘,并将结果放入堆栈的说明

  1. Instructions for putting the number 3 on the stack
  2. Instructions for putting the number 4 on the stack
  3. Instructions for popping the top 2 elements off the stack, adding them, and putting the result on the stack
  4. Instructions for putting the number 36 on the stack
  5. Instructions for popping the top 2 elements on the stack, multiplying them, and putting the result on the stack.

这段代码以这种方式编译树。当你运行程序时,它打印评估表达式所需的MIPS汇编指令。

This code compiles the tree in exactly that way. When you run the program it prints the MIPS assembly instructions needed to evaluate the expression.

#include <stdio.h>

enum TYPE {
    ADD,
    MULTIPLY,
    NUMBER
};

struct tree {
    enum TYPE type;
    int number_val;
    struct tree* left;
    struct tree* right;
};

void emit(struct tree* node);

void emitNumber(struct tree* node) {
    // load the 32-bit number into a register
    printf("lui $t0, %d\n", (node->number_val) & 0xFFFF0000);
    printf("ori $t0, $t0, %d\n", (node->number_val) & 0x0000FFFF);

    // put the number on the stack
    puts("addi $sp, $sp, -4");        
    puts("sw $t0, 0($sp)");
}

void emitAdd(struct tree* node) {
    emit(node->left);
    emit(node->right);

    // pop the left and right args off the stack and put them in registers
    puts("lw $t0, 0($sp)");
    puts("addi $sp, $sp, +4");
    puts("lw $t1, 0($sp)");
    puts("addi $sp, $sp, +4");

    // add them and put the result on the stack    
    puts("add $t2, $t0, $t1");
    puts("addi $sp, $sp, -4");
    puts("sw $t2, 0($sp)");
}

void emitMult(struct tree* node) {
    emit(node->left);
    emit(node->right);

    // pop the left and right args off the stack and put them in registers
    puts("lw $t0, 0($sp)");
    puts("addi $sp, $sp, +4");
    puts("lw $t1, 0($sp)");
    puts("addi $sp, $sp, +4");

    // multiply them and put the result on the stack    
    puts("mul $t2, $t0, $t1");
    puts("addi $sp, $sp, -4");
    puts("sw $t2, 0($sp)");
}

void emit(struct tree* node) {
    if (node == NULL) {
        return;
    }

    switch (node->type) {
        case NUMBER:
            emitNumber(node);
            break;
        case ADD:
            emitAdd(node);
            break;
        case MULTIPLY:
            emitMult(node);
            break;
    }
}

int main() {
    // create an example tree
    struct tree three = { NUMBER, 3, NULL, NULL };
    struct tree four = { NUMBER, 4, NULL, NULL };
    struct tree thirtysix = { NUMBER, 36, NULL, NULL };
    struct tree add = { ADD, 0, &three, &four };
    struct tree mult = { MULTIPLY, 0, &add, &thirtysix };

    emit(&mult);

    // put the calculated result in register $t0
    puts("lw $t0, 0($sp)");
}

您可以在MIPS模拟器上测试输出,如MARS或SPIM。最后,注册 $ t0 保存结果 252 这是答案!

You can test the output on a MIPS simulator like MARS or SPIM. At the end, register $t0 holds the result 252 which is the answer!

要创建一个完整的语言的编译器,需要更多类型的树节点和更多的emit函数。你还需要考虑如何在函数调用期间在堆栈上保存/恢复变量。您还希望编译器在各个体系结构中工作。这里有几个解决方案...你可以发出字节码在虚拟机上运行,​​像Python或Java或C#。或者你可以编译到一个中间程序集,像Clang用LLVM,通过另一个编译阶段来定位的确切的架构。

To make a compiler for a full fledged language it would need more types of nodes in the tree and more emit functions. You would also need to put some thought into how to save/restore variables on the stack during function calls. You would also want the compiler to work across architectures. There are a few solutions for this... you could emit bytecode that runs on a virtual machine like how Python or Java or C# does. Or you could compile to an intermediate assembly like Clang does with LLVM, that goes through another compilation stage to target the exact architecture.

希望这给你一些感觉如何容易它是遍历树来生成实际的指令,以及为什么你更喜欢这种树表示在文本上。

Hopefully this gives you some sense of how easy it is to traverse a tree to generate actual instructions, and why you would prefer this tree representation over text.

这篇关于如何使用分析树?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆