在Haskell中实现语言解释器 [英] Implementing a language interpreter in Haskell

查看:96
本文介绍了在Haskell中实现语言解释器的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想在Haskell中实现一个命令式的语言解释器(用于教育目的)。但是我很难为我的解释器创建正确的架构:我应该如何存储变量?我如何实现嵌套函数调用?我应该如何实现变量范围?如何在我的语言中添加调试可能性?我应该使用monads / monad变压器/其他技术吗?等等。



有没有人知道关于这个主题的优秀文章/论文/教程/资料来源?

解决方案

如果你对编写这类处理器不熟悉,我建议暂时放弃使用monads,并首先关注如何在没有任何花招或哨声的情况下实现准系统。 b

以下内容可能会用作minitutorial。



我假设您已经解决了解析所需程序的源文本的问题写一个解释器,并且你有一些类型来捕获你的语言的抽象语法。我在这里使用的语言非常简单,只有整数表达式和一些基本语句。



Preliminaries



让我们先导入一些我们稍后会用到的模块。

  import Data.Function 
import Data .List

命令式语言的本质是它具有某种形式的可变变量。这里,变量只是由字符串表示:

  type Var = String 



表达式



接下来,我们定义表达式。表达式由整型常量,变量引用和算术运算构成。

  infixl 6:+:,: - :
infixl 7:* :,:/:

数据Exp
= C Int - 常数
| V Var - 变量
| Exp:+:Exp - addition
| Exp: - :Exp - 减去
| Exp:*:Exp - 乘法
| Exp:/:Exp - 除法

例如,将常量2加到变量上的表达式 x Vx表示:+:C 2



语句



语句语言相当简单。我们有三种形式的陈述:变量赋值,循环和序列。

  infix 1:= 

数据stmt
= Var:= Exp - 赋值
| Exp Stmt - 循环
| Seq [stmt] - sequence

例如,用于交换变量 x y 可以用 Seq [tmp:= V x,x:= Vy,y:= Vtmp]



程序



程序只是一个声明。

  type Prog = Stmt 



商店



现在,让我们转到实际翻译。运行程序时,我们需要跟踪分配给程序中不同变量的值。这些值只是整数,作为我们记忆的表示,我们只是使用由变量和值组成的对的列表。

 类型Val = Int 
类型Store = [(Var,Val)]



评估表达式通过将常量映射到它们的值,查找存储器中变量的值以及将算术运算映射到它们的Haskell对应物来评估表达式。$ / $>



  eval :: Exp  - >商店 - > Val 
eval(C n)r = n
eval(V x)r =
的情况查找x r Nothing - >错误(未绑定的变量`++ x ++')
Just v - > v
eval(e1:+:e2)r = eval e1 r + eval e2 r
eval(e1: - :e2)r = eval e1 r - eval e2 r
eval(e1 :*:e2)r = eval e1 r * eval e2 r
eval(e1:/:e2)r = eval e1 r`div` eval e2 r
pre>

请注意,如果商店包含多个变量绑定, lookup 会选择第一个绑定存储。

执行语句



虽然表达式的评估不能改变存储的内容,但执行声明实际上可能会导致商店更新。因此,执行语句的函数将商店作为参数,并生成可能更新的商店。

  exec :: Stmt  - >商店 - >存储
exec(x:= e)r =(x,eval e r):r
exec(while e s)r | eval e r / = 0 = exec(Seq [s,While e s])r
|否则= r
exec(Seq [])r = r
exec(Seq(s:ss))r = exec(Seq ss)(exec sr)


顶级解释器



运行一个程序可以简化为执行顶级语句一个初始商店。

  run :: Prog  - >商店 - >存储
运行pr = nubBy((==)``fst)(exec pr)



<执行语句后,我们清理任何阴影绑定,以便我们可以轻松读取最终存储的内容。



示例



作为例子,考虑下面的程序计算存储在变量 n 中的数字的斐波那契数,并将其结果存储在变量 x

  fib :: Prog 
fib = Seq $ b $b=C 0
,y:= C 1
,while(Vn)$ Seq
[z:= Vx :+:Vy
,x:= Vy
,y:= Vz
,n:= Vn : - :C 1
]
]

例如,环境,我们现在可以使用我们的解释器来计算第25个斐波纳契数:

 > lookupx$ run fib [(n,25)] 
只需75025



Monadic Interpretation



当然,在这里,我们正在处理一个非常简单和微小的命令式语言。随着您的语言变得越来越复杂,您的口译员的实施也将越来越复杂。例如,考虑添加过程时需要添加哪些内容,并需要区分本地(基于堆栈)的存储和全局(基于堆)的存储。回到你的问题的这一部分,你可能确实会考虑引入monad来简化你的解释器的实现。



在上面的示例解释器中,有两种效应可以被一元结构捕获:


  1. 传递和更新商店。

  2. 遇到运行时错误时中止运行程序。 (在上面的实现中,当发生这样的错误时,解释器会崩溃。)


  3. 第一个效果通常由状态monad ,第二个由错误monad。让我们简单地研究如何为我们的解释器做到这一点。



    我们通过从标准库中导入更多模块来准备。

      import Control.Monad 

    我们可以使用monad变换器通过组合基本状态monad和基本错误monad来为我们的两种效果构建复合monad。在这里,我们只是简单地构建复合monad。

      newtype Interp a = Interp {runInterp :: Store  - > ; String(a,Store)} 

    实例Monad Interp其中
    return x = Interp $ \r - > Right(x,r)
    i>> = k = Interp $ \r - >
    的情况runInterp i r Left msg - >留下msg
    Right(x,r') - > runInterp(k x)r'
    fail msg = Interp $ \_ - >左msg

    为了读写商店,我们引入了有效的函数 rd wr

      rd: :Var  - > Interp Val 
    rd x = Interp $ \r - >案例查询x r of
    Nothing - >左(unbound变量`++ x ++')
    只是v - > Right(v,r)

    wr :: Var - > Val - > Interp()
    wr x v = Interp $ \r - >请注意,右(()如果变量查找失败,rd 产生一个 Left 包装的错误消息。



    单表达式求值器的单值版本现在读取

      eval :: Exp  - > Interp Val 
    eval(C n)= do返回n
    eval(V x)= do rd x
    eval(e1:+:e2)= do v1 < - eval e1
    v2 < - eval e2
    return(v1 + v2)
    eval(e1: - :e2)= do v1 < - eval e1
    v2 < - eval e2
    return(v1-v2)
    eval(e1:*:e2)= do v1 < - eval e1
    v2 < - eval e2
    return(v1 * v2)
    eval(e1:/:e2)= do v1 < - eval e1
    v2 < - eval e2
    if v2 == 0
    然后失败零除
    else return(v1`div` v2)

    :/:,除以零会导致通过 Monad - 方法失败,这对于 Interp 来说简化为将消息包装在 Left - 值中。



    为执行语句我们有

      exec :: Stmt  - > Interp()
    exec(x:= e)= do v < - eval e
    wr xv
    exec(While es)= do v < - eval e
    when (vq = 0)(exec(Seq [s,While es]))
    exec(Seq [])= do return()
    exec(Seq(s:ss))= do exec s
    exec(Seq ss)

    exec 表示这些语句不会导致值,但是只会对它们对商店的影响或它们可能触发的运行时错误执行。

    最后,在函数 run 中,我们执行一次性计算并处理其效果。

      run :: Prog  - >商店 - >任一字符串存储
    运行p r = case
    的runInterp(exec p)r Left msg - >左msg
    右(_,r') - >在交互式环境中,右键点击(nubBy((== ==``fst)r')

    我们现在可以重新审视我们的示例程序的解释:

     >查找x`fmap` run fib [(n,25)] 
    正确(仅75025)

    > lookupx`fmap` run fib []
    Leftunbound variable`n'


    I want to implement an imperative language interpreter in Haskell (for educational purposes). But it's difficult for me to create right architecture for my interpreter: How should I store variables? How can I implement nested function calls? How should I implement variable scoping? How can I add debugging possibilities in my language? Should I use monads/monad transformers/other techniques? etc.

    Does anybody know good articles/papers/tutorials/sources on this subject?

    解决方案

    If you are new to writing this kind of processors, I would recommend to put off using monads for a while and first focus on getting a barebones implementation without any bells or whistles.

    The following may serve as a minitutorial.

    I assume that you have already tackled the issue of parsing the source text of the programs you want to write an interpreter for and that you have some types for capturing the abstract syntax of your language. The language that I use here is very simple and only consists of integer expressions and some basic statements.

    Preliminaries

    Let us first import some modules that we will use in just a bit.

    import Data.Function
    import Data.List
    

    The essence of an imperative language is that it has some form of mutable variables. Here, variables simply represented by strings:

    type Var = String
    

    Expressions

    Next, we define expressions. Expressions are constructed from integer constants, variable references, and arithmetic operations.

    infixl 6 :+:, :-:
    infixl 7 :*:, :/:
    
    data Exp
      = C Int        -- constant                                                     
      | V Var        -- variable                                                     
      | Exp :+: Exp  -- addition                                                     
      | Exp :-: Exp  -- subtraction                                                  
      | Exp :*: Exp  -- multiplication                                               
      | Exp :/: Exp  -- division
    

    For example, the expression that adds the constant 2 to the variable x is represented by V "x" :+: C 2.

    Statements

    The statement language is rather minimal. We have three forms of statements: variable assignments, while loops, and sequences.

    infix 1 :=
    
    data Stmt
      = Var := Exp      -- assignment                                                
      | While Exp Stmt  -- loop                                                      
      | Seq [Stmt]      -- sequence
    

    For example, a sequence of statements for "swapping" the values of the variables x and y can be represented by Seq ["tmp" := V "x", "x" := V "y", "y" := V "tmp"].

    Programs

    A program is just a statement.

    type Prog = Stmt
    

    Stores

    Now, let us move to the actual interpreter. While running a program, we need to keep track of the values assigned to the different variables in the programs. These values are just integers and as a representation of our "memory" we just use lists of pairs consisting of a variable and a value.

    type Val = Int
    type Store = [(Var, Val)]
    

    Evaluating expressions

    Expressions are evaluated by mapping constants to their value, looking up the values of variables in the store, and mapping arithmetic operations to their Haskell counterparts.

    eval :: Exp -> Store -> Val
    eval (C n) r       = n
    eval (V x) r       = case lookup x r of
                           Nothing -> error ("unbound variable `" ++ x ++ "'")
                           Just v  -> v
    eval (e1 :+: e2) r = eval e1 r + eval e2 r
    eval (e1 :-: e2) r = eval e1 r - eval e2 r
    eval (e1 :*: e2) r = eval e1 r * eval e2 r
    eval (e1 :/: e2) r = eval e1 r `div` eval e2 r
    

    Note that if the store contains multiple bindings for a variable, lookup selects the bindings that comes first in the store.

    Executing statements

    While the evaluation of an expression cannot alter the contents of the store, executing a statement may in fact result in an update of the store. Hence, the function for executing a statement takes a store as an argument and produces a possibly updated store.

    exec :: Stmt -> Store -> Store
    exec (x := e) r                    = (x, eval e r) : r
    exec (While e s) r | eval e r /= 0 = exec (Seq [s, While e s]) r
                       | otherwise     = r
    exec (Seq []) r                    = r
    exec (Seq (s : ss)) r              = exec (Seq ss) (exec s r)
    

    Note that, in the case of assignments, we simply push a new binding for the updated variable to the store, effectively shadowing any previous bindings for that variable.

    Top-level Interpreter

    Running a program reduces to executing its top-level statement in the context of an initial store.

    run :: Prog -> Store -> Store
    run p r = nubBy ((==) `on` fst) (exec p r)
    

    After executing the statement we clean up any shadowed bindings, so that we can easily read off the contents of the final store.

    Example

    As an example, consider the following program that computes the Fibonacci number of the number stored in the variable n and stores its result in the variable x.

    fib :: Prog
    fib = Seq
      [ "x" := C 0
      , "y" := C 1
      , While (V "n") $ Seq
          [ "z" := V "x" :+: V "y"
          , "x" := V "y"
          , "y" := V "z"
          , "n" := V "n" :-: C 1
          ]
      ]
    

    For instance, in an interactive environment, we can now use our interpreter to compute the 25th Fibonacci number:

    > lookup "x" $ run fib [("n", 25)]
    Just 75025
    

    Monadic Interpretation

    Of course, here, we are dealing with a very simple and tiny imperative language. As your language gets more complex, so will the implementation of your interpreter. Think for example about what additions you need when you add procedures and need to distinguish between local (stack-based) storage and global (heap-based) storage. Returning to that part of your question, you may then indeed consider the introduction of monads to streamline the implementation of your interpreter a bit.

    In the example interpreter above, there are two "effects" that are candidates for being captured by a monadic structure:

    1. The passing around and updating of the store.
    2. Aborting running the program when a run-time error is encountered. (In the implementation above, the interpreter simply crashes when such an error occurs.)

    The first effect is typically captured by a state monad, the second by an error monad. Let us briefly investigate how to do this for our interpreter.

    We prepare by importing just one more module from the standard libraries.

    import Control.Monad
    

    We can use monad transformers to construct a composite monad for our two effects by combining a basic state monad and a basic error monad. Here, however, we simply construct the composite monad in one go.

    newtype Interp a = Interp { runInterp :: Store -> Either String (a, Store) }
    
    instance Monad Interp where
      return x = Interp $ \r -> Right (x, r)
      i >>= k  = Interp $ \r -> case runInterp i r of
                   Left msg      -> Left msg
                   Right (x, r') -> runInterp (k x) r'
      fail msg = Interp $ \_ -> Left msg
    

    For reading from and writing to the store, we introduce effectful functions rd and wr:

    rd :: Var -> Interp Val
    rd x = Interp $ \r -> case lookup x r of
             Nothing -> Left ("unbound variable `" ++ x ++ "'")
             Just v  -> Right (v, r)
    
    wr :: Var -> Val -> Interp ()
    wr x v = Interp $ \r -> Right ((), (x, v) : r)
    

    Note that rd produces a Left-wrapped error message if a variable lookup fails.

    The monadic version of the expression evaluator now reads

    eval :: Exp -> Interp Val
    eval (C n)       = do return n
    eval (V x)       = do rd x
    eval (e1 :+: e2) = do v1 <- eval e1
                          v2 <- eval e2
                          return (v1 + v2)
    eval (e1 :-: e2) = do v1 <- eval e1
                          v2 <- eval e2
                          return (v1 - v2)
    eval (e1 :*: e2) = do v1 <- eval e1
                          v2 <- eval e2
                          return (v1 * v2)
    eval (e1 :/: e2) = do v1 <- eval e1
                          v2 <- eval e2
                          if v2 == 0
                            then fail "division by zero"
                            else return (v1 `div` v2)
    

    In the case for :/:, division by zero results in an error message being produced through the Monad-method fail, which, for Interp, reduces to wrapping the message in a Left-value.

    For the execution of statements we have

    exec :: Stmt -> Interp ()
    exec (x := e)       = do v <- eval e
                             wr x v
    exec (While e s)    = do v <- eval e
                             when (v /= 0) (exec (Seq [s, While e s]))
    exec (Seq [])       = do return ()
    exec (Seq (s : ss)) = do exec s
                             exec (Seq ss)
    

    The type of exec conveys that statements do not result in values but are executed only for their effects on the store or the run-time errors they may trigger.

    Finally, in the function run we perform a monadic computation and process its effects.

    run :: Prog -> Store -> Either String Store
    run p r = case runInterp (exec p) r of
                Left msg      -> Left msg
                Right (_, r') -> Right (nubBy ((==) `on` fst) r')
    

    In the interactive environment, we can now revisit the interpretation of our example program:

    > lookup "x" `fmap` run fib [("n", 25)]
    Right (Just 75025)
    
    > lookup "x" `fmap` run fib []
    Left "unbound variable `n'"
    

    这篇关于在Haskell中实现语言解释器的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆