如何从头开始编写递归下降解析器? [英] How to write a recursive descent parser from scratch?

查看:86
本文介绍了如何从头开始编写递归下降解析器?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

作为纯粹的学术练习,我从头开始编写递归下降解析器-无需使用ANTLR或lex/yacc.

As a purely academic exercise, I'm writing a recursive descent parser from scratch -- without using ANTLR or lex/yacc.

我正在编写一个简单的函数,它将数学表达式转换为等效的AST.我有以下内容:

I'm writing a simple function which converts math expressions into their equivalent AST. I have the following:

// grammar
type expr =
    | Lit of float
    | Add of expr * expr
    | Mul of expr * expr
    | Div of expr * expr
    | Sub of expr * expr

// tokens
type tokens =
    | Num of float
    | LParen | RParen
    | XPlus | XStar | XMinus | XSlash

let tokenize (input : string) =
    Regex.Matches(input.Replace(" ", ""), "\d+|[+/*\-()]")
    |> Seq.cast<Match>
    |> Seq.map (fun x -> x.Value)
    |> Seq.map (function
        | "+" -> XPlus
        | "-" -> XMinus
        | "/" -> XSlash
        | "*" -> XStar
        | "(" -> LParen
        | ")" -> RParen
        | num -> Num(float num))
    |> Seq.to_list

因此,tokenize "10 * (4 + 5) - 1"返回以下令牌流:

So, tokenize "10 * (4 + 5) - 1" returns the following token stream:

[Num 10.0; XStar; LParen; Num 4.0; XPlus; Num 5.0; RParen; XMinus; Num 1.0]

在这一点上,我想根据操作符优先级将令牌流映射到其AST:

At this point, I'd like to map the token stream to its AST with respect to operator precedence:

Sub(
    Mul(
        Lit 10.0
        ,Add(Lit 4.0, Lit 5.0)
       )
    ,Lit 1.0
   )

但是,我正在绘制空白.我从来没有从头开始编写过解析器,即使从原则上来说我也不知道如何开始.

However, I'm drawing a blank. I've never written a parser from scratch, and I don't know even in principle how to begin.

如何转换令牌流及其代表的AST?

推荐答案

您了解语言语法吗?

假设是,您有一种语法,其中包含规则

Assuming yes, you have a grammar with rules along the lines

...
addTerm := mulTerm addOp addTerm
         | mulTerm

addOp   := XPlus | XMinus

mulTerm := litOrParen mulOp mulTerm
         | litOrParen
...

最终变成了类似的代码(在浏览器中编写代码,从未编译)

which ends up turning into code like (writing code in browser, never compiled)

let rec AddTerm() =
    let mulTerm = MulTerm() // will parse next mul term (error if fails to parse)
    match TryAddOp with  // peek ahead in token stream to try parse
    | None -> mulTerm    // next token was not prefix for addOp rule, stop here
    | Some(ao) ->        // did parse an addOp
         let rhsMulTerm = MulTerm()
         match ao with
         | XPlus -> Add(mulTerm, rhsMulTerm)
         | XMinus -> Sub(mulTerm, rhsMulTerm)
and TryAddOp() =
    let next = tokens.Peek() 
    match next with
    | XPlus | XMinus ->
        tokens.ConsumeNext()
        Some(next)
    | _ -> None
...

希望您能看到基本的想法.假设全局可变令牌流允许窥视下一个令牌"和消费下一个令牌".

Hopefully you see the basic idea. This assumes a global mutable token stream that allows both 'peek at next token' and 'consume next token'.

这篇关于如何从头开始编写递归下降解析器?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆