Haskell中的类型化抽象语法和DSL设计 [英] Typed abstract syntax and DSL design in Haskell

查看:51
本文介绍了Haskell中的类型化抽象语法和DSL设计的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在Haskell中设计DSL,我想进行赋值操作.这样的事情(下面的代码仅用于在有限的上下文中解释我的问题,我没有类型检查过的Stmt类型):

I'm designing a DSL in Haskell and I would like to have an assignment operation. Something like this (the code below is just for explaining my problem in a limited context, I didn't have type checked Stmt type):

 data Stmt = forall a . Assign String (Exp a) -- Assignment operation
           | forall a. Decl String a          -- Variable declaration 
 data Exp t where
    EBool   :: Bool -> Exp Bool
    EInt    :: Int  -> Exp Int
    EAdd    :: Exp Int -> Exp Int -> Exp Int
    ENot    :: Exp Bool -> Exp Bool

在前面的代码中,我能够使用GADT对表达式实施类型约束.我的问题是如何强制分配的左侧是:1)已定义,即必须在使用变量之前声明变量,以及2)右侧必须具有与左侧变量相同类型的变量?

In the previous code, I'm able to use a GADT to enforce type constraints on expressions. My problem is how can I enforce that the left hand side of an assignment is: 1) Defined, i.e., a variable must be declared before it is used and 2) The right hand side must have the same type of the left hand side variable?

我知道,在完全依赖类型的语言中,我可以定义通过某种类型的键入上下文(即,已定义变量及其类型的列表)建立索引的语句.我相信这将解决我的问题.但是,我想知道Haskell是否有某种方法可以实现这一目标.

I know that in a full dependently typed language, I could define statements indexed by some sort of typing context, that is, a list of defined variables and their type. I believe that this would solve my problem. But, I'm wondering if there is some way to achieve this in Haskell.

任何赞赏示例代码或文章的指针.

Any pointer to example code or articles is highly appreciated.

推荐答案

鉴于我的工作重点是在类型级别对范围和类型安全性进行编码的相关问题,我偶然发现了这个古老的问题,同时在周围进行了搜索和以为我会尝试的.

Given that my work focuses on related issues of scope and type safety being encoded at the type-level, I stumbled upon this old-ish question whilst googling around and thought I'd give it a try.

我认为,这篇文章提供的答案非常接近原始规范.一旦设置正确,整个过程就很短.

This post provides, I think, an answer quite close to the original specification. The whole thing is surprisingly short once you have the right setup.

首先,我将从一个示例程序开始,让您大致了解最终结果:

First, I'll start with a sample program to give you an idea of what the end result looks like:

program :: Program
program = Program
  $  Declare (Var :: Name "foo") (Of :: Type Int)
  :> Assign  (The (Var :: Name "foo")) (EInt 1)
  :> Declare (Var :: Name "bar") (Of :: Type Bool)
  :> increment (The (Var :: Name "foo"))
  :> Assign  (The (Var :: Name "bar")) (ENot $ EBool True)
  :> Done

作用域

为了确保我们只能为之前声明的变量赋值,我们需要范围的概念.

Scoping

In order to ensure that we may only assign values to variables which have been declared before, we need a notion of scope.

GHC.TypeLits 为我们提供了类型级别的字符串(称为 Symbol ),因此如果需要,我们可以很好地使用字符串作为变量名.并且因为我们要确保类型安全,所以每个变量声明都带有一个类型注释,该注释将与变量名称一起存储.因此,我们的作用域类型为: [((Symbol,*)] .

GHC.TypeLits provides us with type-level strings (called Symbol) so we can very-well use strings as variable names if we want. And because we want to ensure type safety, each variable declaration comes with a type annotation which we will store together with the variable name. Our type of scopes is therefore: [(Symbol, *)].

我们可以使用类型族来测试给定的 Symbol 是否在范围内,并在这种情况下返回其关联的类型:

We can use a type family to test whether a given Symbol is in scope and return its associated type if that is the case:

type family HasSymbol (g :: [(Symbol,*)]) (s :: Symbol) :: Maybe * where
  HasSymbol '[]            s = 'Nothing
  HasSymbol ('(s, a) ': g) s = 'Just a
  HasSymbol ('(t, a) ': g) s = HasSymbol g s

通过此定义,我们可以定义变量的概念:作用域 g 中类型为 a 的变量是符号 s ,使得 HasSymbol gs 返回'Just a .这就是 ScopedSymbol 数据类型通过使用存在性量化来存储 s 的方式表示的.

From this definition we can define a notion of variable: a variable of type a in scope g is a symbol s such that HasSymbol g s returns 'Just a. This is what the ScopedSymbol data type represents by using an existential quantification to store the s.

data ScopedSymbol (g :: [(Symbol,*)]) (a :: *) = forall s.
  (HasSymbol g s ~ 'Just a) => The (Name s)

data Name (s :: Symbol) = Var

在这里,我故意在各处滥用符号: ScopedSymbol 类型的构造函数,而 Name Proxy 类型更好名称和构造函数.这样我们就可以写出如下内容:

Here I am purposefully abusing notations all over the place: The is the constructor for the type ScopedSymbol and Name is a Proxy type with a nicer name and constructor. This allows us to write such niceties as:

example :: ScopedSymbol ('("foo", Int) ': '("bar", Bool) ': '[]) Bool
example = The (Var :: Name "bar")

声明

现在我们有了作用域和该作用域中类型明确的变量的概念,我们可以开始考虑声明 Statement 应该具有的作用.鉴于可以在 Statement 中声明新变量,我们需要找到一种在范围内传播此信息的方法.关键的后见之明是拥有两个索引:输入输出范围.

Statements

Now that we have a notion of scope and of well-typed variables in that scope, we can start considering the effects Statements should have. Given that new variables can be declared in a Statement, we need to find a way to propagate this information in the scope. The key hindsight is to have two indices: an input and an output scope.

声明新变量及其类型,将使用变量名称和相应类型对扩展当前范围.

To Declare a new variable together with its type will expand the current scope with the pair of the variable name and the corresponding type.

分配不会修改作用域.它们只是将 ScopedSymbol 与相应类型的表达式相关联.

Assignments on the other hand do not modify the scope. They merely associate a ScopedSymbol to an expression of the corresponding type.

data Statement (g :: [(Symbol, *)]) (h :: [(Symbol,*)]) where
  Declare :: Name s -> Type a -> Statement g ('(s, a) ': g)
  Assign  :: ScopedSymbol g a -> Exp g a -> Statement g g

data Type (a :: *) = Of

再次,我们引入了代理类型以具有更好的用户级语法.

Once again we have introduced a proxy type to have a nicer user-level syntax.

example' :: Statement '[] ('("foo", Int) ': '[])
example' = Declare (Var :: Name "foo") (Of :: Type Int)

example'' :: Statement ('("foo", Int) ': '[]) ('("foo", Int) ': '[])
example'' = Assign (The (Var :: Name "foo")) (EInt 1)

通过定义以下类型对齐的序列的GADT,可以将

Statement 链接为一种保留作用域的方式:

Statements can be chained in a scope-preserving way by defining the following GADT of type-aligned sequences:

infixr 5 :>
data Statements (g :: [(Symbol, *)]) (h :: [(Symbol,*)]) where
  Done :: Statements g g
  (:>) :: Statement g h -> Statements h i -> Statements g i

表达式

表达式与您的原始定义几乎没有什么不同,除了它们现在已经确定了作用域,并且新的构造函数 EVar 使我们可以取消引用先前声明的变量(使用 ScopedSymbol ),从而为我们提供了适当类型的表达式.

Expressions

Expressions are mostly unchanged from your original definition except that they are now scoped and a new constructor EVar lets us dereference a previously-declared variable (using ScopedSymbol) giving us an expression of the appropriate type.

data Exp (g :: [(Symbol,*)]) (t :: *) where
  EVar    :: ScopedSymbol g a -> Exp g a
  EBool   :: Bool -> Exp g Bool
  EInt    :: Int  -> Exp g Int
  EAdd    :: Exp g Int -> Exp g Int -> Exp g Int
  ENot    :: Exp g Bool -> Exp g Bool

程序

Program 只是从空作用域开始的一系列语句.我们再次使用现有的量化方法来隐藏最终的范围.

Programs

A Program is quite simply a sequence of statements starting in the empty scope. We use, once more, an existential quantification to hide the scope we end up with.

data Program = forall h. Program (Statements '[] h)

显然可以在Haskell中编写子例程,然后在程序中使用它们.在示例中,我有一个非常简单的 increment ,可以这样定义:

It is obviously possible to write subroutines in Haskell and use them in your programs. In the example, I have the very simple increment which can be defined like so:

increment :: ScopedSymbol g Int -> Statement g g
increment v = Assign v (EAdd (EVar v) (EInt 1))

我已经上传了完整的代码片段以及正确的 LANGUAGE 编译指示和此处列出的示例

I have uploaded the whole code snippet together with the right LANGUAGE pragmas and the examples listed here in a self-contained gist. I haven't however included any comments there.

这篇关于Haskell中的类型化抽象语法和DSL设计的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆