如何构造抽象语法树 [英] How to construct an abstract syntax tree

查看:745
本文介绍了如何构造抽象语法树的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我对AST是什么有一个大致的了解,但我想知道如何构造它。

I have a general idea of what an AST is, but I want to know how to construct one.

如果您获得了语法和语法分析树,如何构造AST?

If you're given a grammar and a parse tree, how do you construct the AST?

如果给了语法和表达式怎么办?

How do you do it if you're given a grammar and an expression?

推荐答案

首先,语法用于从表达式构造解析树。因此,如果您已经有一个语法分析树,则不需要语法。

Well, first off, the grammar is used to construct a parse tree from an expression. So if you already have a parse tree, you don't need the grammar.

根据您的语法分析器所做的工作,通过解析一个表达式可能已经是抽象语法树。或者它可能是一个简单的分析树,需要第二遍才能构造ast。

Depending on how much work your parser does, the resulting tree that is formed from parsing an expression could already be an abstract syntax tree. Or it could be a simple parse tree which requires a second pass to construct the ast.

要从语法和表达式构造分析树,您首先必须将您的语法转换为工作代码。通常,您会将工作拆分为令牌生成器,该令牌生成器将表示表达式的输入流拆分为令牌列表,以及解析器获取令牌列表并从中构造一个解析树。

To construct the parse tree from a grammar and an expression, you would first have to convert your grammar into working code. Typically, you would split the work into a tokenizer which splits the input stream representing the expression into a list of tokens, and a parser which takes the list of tokens and constructs a parse tree\ast from it.

因此,表达式 1 + 2 *(3 + 4)可能会分为以下令牌列表:

So the expression 1 + 2*(3+4) might be split into a list of tokens like this:

1 - int
+ - add_operator
2 - int
* - mul_operator
( - lparen
3 - int
+ - add_operator
4 - int
) - rparen

第一列是实际文本值。第二个代表令牌类型。这些标记被馈送到解析器中,该解析器是根据您的语法构建的,可以识别标记并构建解析树。

The first column is the actual text value. The second represents the token type. These tokens are fed into the parser, which is built from your grammar and recognizes the tokens and builds the parse tree.

因此,如何编写词法标记器和实际的解析器?您可以手动滚动。或者,更常见的是使用解析器生成器,例如coco或antlr或lex / yacc。这些工具将对您的语法进行描述,并为tokenzier和解析器生成代码。 (代码生成器适用于大多数流行的语言,也存在一些不流行的语言。)

So, how does one write the lexical tokenizer and the actual parser? You could roll your own by hand. Or, more commonly, use a parser generator like coco or antlr or lex/yacc. These tools take a description of your grammar and generate the code for a tokenzier and parser. (Code generators exist for most popular languages and some unpopular ones as well.)

构造解析器的方式在很大程度上取决于所使用的语言。在Haskell中编写解析器的方式与在C语言中编写解析器的方式完全不同。

How you construct your parser depends heavily on what language you use. How you would write a parser in Haskell is entirely different from how you would in, say, C.

可可是各种语言的解析器生成器,其中
也随附有关如何入门的文档。

Coco is a parser generator for various languages, which also comes with documentation on how to get started.

如果您是Python的人,那么 pyparsing 也许适合您。

If Python's your thing, then pyparsing maybe for you.

这篇关于如何构造抽象语法树的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆