如何构造抽象语法树 [英] How to construct an abstract syntax tree

查看:85
本文介绍了如何构造抽象语法树的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我对 AST 有一个大概的了解,但我想知道如何构建一个.

I have a general idea of what an AST is, but I want to know how to construct one.

如果给你一个语法和一个解析树,你如何构建 AST?

If you're given a grammar and a parse tree, how do you construct the AST?

如果给你一个语法和一个表达式,你会怎么做?

How do you do it if you're given a grammar and an expression?

推荐答案

好吧,首先,语法用于从表达式构建解析树.所以如果你已经有了一个解析树,你就不需要语法了.

Well, first off, the grammar is used to construct a parse tree from an expression. So if you already have a parse tree, you don't need the grammar.

根据你的解析器做了多少工作,解析表达式形成的结果树可能已经是一个抽象的语法树.或者它可以是一个简单的解析树,需要第二遍来构造 ast.

Depending on how much work your parser does, the resulting tree that is formed from parsing an expression could already be an abstract syntax tree. Or it could be a simple parse tree which requires a second pass to construct the ast.

要从语法和表达式构建解析树,您首先必须将语法转换为工作代码.通常,您会将工作拆分为一个分词器,分词器将表示表达式的输入流拆分为一个标记列表,以及一个解析器,该分析器获取标记列表并从中构建分析树ast.

To construct the parse tree from a grammar and an expression, you would first have to convert your grammar into working code. Typically, you would split the work into a tokenizer which splits the input stream representing the expression into a list of tokens, and a parser which takes the list of tokens and constructs a parse treeast from it.

所以表达式 1 + 2*(3+4) 可能会被拆分成这样的标记列表:

So the expression 1 + 2*(3+4) might be split into a list of tokens like this:

1 - int
+ - add_operator
2 - int
* - mul_operator
( - lparen
3 - int
+ - add_operator
4 - int
) - rparen

第一列是实际的文本值.第二个代表令牌类型.这些标记被送入解析器,解析器根据您的语法构建并识别标记并构建解析树.

The first column is the actual text value. The second represents the token type. These tokens are fed into the parser, which is built from your grammar and recognizes the tokens and builds the parse tree.

那么,如何编写词法标记器和实际解析器?你可以自己动手卷.或者,更常见的是,使用解析器生成器,如 coco 或 antlr 或 lex/yacc.这些工具对您的语法进行描述,并为 tokenzier 和解析器生成代码.(大多数流行语言和一些不受欢迎的语言都有代码生成器.)

So, how does one write the lexical tokenizer and the actual parser? You could roll your own by hand. Or, more commonly, use a parser generator like coco or antlr or lex/yacc. These tools take a description of your grammar and generate the code for a tokenzier and parser. (Code generators exist for most popular languages and some unpopular ones as well.)

您构建解析器的方式在很大程度上取决于您使用的语言.在 Haskell 中编写解析器的方式与在 C 中的方式完全不同.

How you construct your parser depends heavily on what language you use. How you would write a parser in Haskell is entirely different from how you would in, say, C.

Coco 是用于各种语言的解析器生成器,这也附带有关如何开始使用的文档.

Coco is a parser generator for various languages, which also comes with documentation on how to get started.

如果您喜欢 Python,那么 pyparsing 可能适合您.

If Python's your thing, then pyparsing maybe for you.

这篇关于如何构造抽象语法树的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆