Haskell 中的动态调度 [英] Dynamic dispatch in Haskell

查看:33
本文介绍了Haskell 中的动态调度的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

例如用 Java 编写的程序在很大程度上依赖于动态调度.

Programs written in, for example, Java rely a lot on dynamic dispatch.

这些程序如何用 Haskell 等函数式语言表达?

How are such programs expressed in functional languages such as Haskell?

换句话说,Haskell 表达动态调度"思想的方式是什么?

In other words, what is the Haskell way of expressing the idea underneath "dynamic dispatch"?

推荐答案

答案看似简单:高阶函数.OO 语言中具有虚方法的对象只不过是功能的美化记录以及一些本地状态.在 Haskell 中,您可以直接使用函数的记录,并将本地状态存储在它们的闭包中.

The answer is deceptively simple: higher-order functions. An object with virtual methods in an OO language is nothing more than a glorified record of functions together with some local state. In Haskell you can use records of functions directly, and store the local state in their closures.

更具体地说,一个面向对象的对象包括:

More concretely, an OO object consists of:

  • 指向 vtable(虚拟方法表)的指针 (vptr),其中包含对象类的虚拟方法的实现.换句话说,一堆函数指针;功能记录.值得注意的是,每个函数都有一个隐藏参数,即对象本身,该参数是隐式传递的.
  • 对象的数据成员(本地状态)

很多时候,由于缺乏对闭包的支持,整个对象和虚函数大厦感觉就像是一个精心设计的解决方法.

A lot of the time the whole edifice of objects and virtual functions feels like an elaborate workaround for the lack of support for closures.

例如考虑Java的Comparator接口:

For example consider Java's Comparator interface:

public interface Comparator<T> {
    int compare(T o1, T o2); // virtual (per default)
}

并且假设您想使用它根据字符串的第 N 个字符对字符串列表进行排序(假设它们足够长).您将定义一个类:

And suppose you want to use it to sort a list of strings based on the strings' Nth characters (assume they are long enough). You would define a class:

public class MyComparator implements Comparator<String> {
    private final int _n;

    MyComparator(int n) {
        _n = n;
    }

    int compare(String s1, String s2) {
        return s1.charAt(_n) - s2.charAt(_n);
    }
}

然后你使用它:

Collections.sort(myList, new MyComparator(5));      

在 Haskell 中,你会这样做:

In Haskell you would do it like this:

sortBy :: (a -> a -> Ordering) -> [a] -> [a]

myComparator :: Int -> (String -> String -> Ordering)
myComparator n = s1 s2 -> (s1 !! n) `compare` (s2 !! n)
-- n is implicitly stored in the closure of the function we return

foo = sortBy (myComparator 5) myList

如果您不熟悉 Haskell,下面是它在一种伪 Java 中的大致外观:(我只会这样做一次)

If you're not familiar with Haskell, here's how it would roughly look in a kind of pseudo-Java: (I'm only going to do this once)

public void <T> sortBy(List<T> list, Ordering FUNCTION(T, T) comparator) { ... }

public (Ordering FUNCTION(String, String)) myComparator(int n) {
    return FUNCTION(String s1, String s2) {
        return s1[n].compare(s2[n]);
    }
}

public void foo() {
    sortBy(myList, myComparator(5));
}

请注意,我们没有定义任何类型.我们使用的都是函数.在这两种情况下,我们传递给 sort 函数的有效负载"都是一个函数,它接受两个元素并给出它们的相对顺序.在一种情况下,这是通过定义实现接口的类型,以适当的方式实现其虚函数,并传递该类型的对象来实现的;在另一种情况下,我们只是直接传递了一个函数.在这两种情况下,我们都在传递给 sort 函数的事物中存储了一个内部整数.在一种情况下,这是通过向我们的类型添加一个私有数据成员来完成的,在另一种情况下,通过在我们的函数中简单地引用它,使其保留在函数的闭包中.

Note that we did not define any types. All we used are functions. In both cases the "payload" that we passed to the sort function was a function that takes two elements and gives their relative ordering. In one case this was accomplished by defining a type implementing an interface, implementing its virtual function in the appropriate way, and passing an object of that type; in the other case we just passed a function directly. In both cases we stored an internal integer in the thing we passed to the sort function. In one case this was done by adding a private data member to our type, in the other by simply referring to it in our function, causing it to be retained in the function's closure.

考虑带有事件处理程序的小部件的更详细示例:

Consider the more elaborate example of a widget with event handlers:

public class Widget {
    public void onMouseClick(int x, int y) { }
    public void onKeyPress(Key key) { }
    public void paint() { }
    ...
}

public class MyWidget extends Widget {
    private Foo _foo;
    private Bar _bar;
    MyWidget(...) {
        _foo = something;
        _bar = something; 
    }
    public void onMouseClick(int x, int y) {
        ...do stuff with _foo and _bar...
    }
}

在 Haskell 中,你可以这样做:

In Haskell you could do it like this:

data Widget = Widget {
    onMouseClick :: Int -> Int -> IO (),
    onKeyPress   :: Key -> IO (),
    paint        :: IO (),
    ...
}

constructMyWidget :: ... -> IO Widget
constructMyWidget = do
    foo <- newIORef someFoo
    bar <- newIORef someBar
    return $ Widget {
        onMouseClick = x y -> do
            ... do stuff with foo and bar ...,
        onKeyPress = key -> do ...,
        paint = do ...
    }

再次注意,在最初的 Widget 之后,我们没有定义任何类型.我们只写了一个函数来构造函数的记录并将东西存储在它们的闭包中.大多数时候,这也是在 OO 语言中定义子类的唯一原因.与我们之前的示例的唯一区别是,不是一个函数,而是多个函数,在 Java 情况下,通过简单地将多个函数放入接口(及其实现)中来编码,而在 Haskell 中则通过传递函数记录而不是单一功能.(我们可以在前面的例子中传递一个包含单个函数的记录,但我们不喜欢它.)

Note again that after the initial Widget, we did not define any types. We only wrote a function to construct a record of functions and store things in their closures. Which, most of the time, is also the only reason to define a subclass in an OO language. The only difference from our previous example is that instead of one function there's multiple, which in the Java case is encoded by simply putting more than one function in the interface (and its implementations), and in Haskell by passing a record of functions instead of a single function. (We could have passed a record containing a single function in the previous example, but we didn't feel like it.)

(值得注意的是,很多时候您不需要动态调度.如果您只想根据类型的默认排序对列表进行排序,那么您只需使用 sort :: Ord a => [a] -> [a],它使用为给定的 a 定义的 Ord 实例类型,静态选择.)

(It's worth noting somewhere that a lot of the time you don't need dynamic dispatch. If you just wanted to sort a list based on the default ordering for the type, then you would simply use sort :: Ord a => [a] -> [a], which uses the Ord instance defined for the given a type, which is selected statically.)

Java 方法和上面的 Haskell 方法之间的一个区别是,在 Java 方法中,对象的行为(除了它的本地状态)由它的类型决定(或者不太友好,每个实现都需要一个新类型).在 Haskell 中,我们以我们喜欢的方式记录函数.大多数时候这是一个纯粹的胜利(获得了灵活性,没有损失),但假设出于某种原因我们希望它采用 Java 方式.在这种情况下,正如其他答案所提到的那样,要走的路是类型类和存在性.

One difference between the Java approach and the Haskell approach above is that with the Java approach, the behaviour of an object (except for its local state) is determined by its type (or less charitably, each implementation requires a new type). In Haskell we're making our records of functions however way we like. Most of the time this is a pure win (flexibility gained, nothing lost), but suppose that for some reason we want it the Java way. In that case the way to go, as mentioned by other answers, is type classes and existentials.

继续我们的 Widget 示例,假设我们希望 Widget 的实现遵循其类型(每个实现都需要一个新类型).我们定义一个类型类来将类型映射到它的实现:

To continue our Widget example, suppose that we want the implementation of a Widget to follow from its type (to require a new type for each implementation). We define a type class to map the type to its implementation:

-- the same record as before, we just gave it a different name
data WidgetImpl = WidgetImpl {
    onMouseClick :: Int -> Int -> IO (),
    onKeyPress   :: Key -> IO (),
    paint        :: IO (),
    ...
}

class IsWidget a where
    widgetImpl :: a -> WidgetImpl

data Widget = forall a. IsWidget a => Widget a

sendClick :: Int -> Int -> Widget -> IO ()
sendClick x y (Widget a) = onMouseClick (widgetImpl a) x y

data MyWidget = MyWidget {
    foo :: IORef Foo,
    bar :: IORef Bar
}

constructMyWidget :: ... -> IO MyWidget
constructMyWidget = do
    foo_ <- newIORef someFoo
    bar_ <- newIORef someBar
    return $ MyWidget {
        foo = foo_,
        bar = bar_
    }

instance IsWidget MyWidget where
    widgetImpl myWidget = WidgetImpl {
            onMouseClick = x y -> do
                ... do stuff with (foo myWidget) and (bar myWidget) ...,
            onKeyPress = key -> do ...,
            paint = do ...
        }

我们有一个类只是为了获取函数的记录,然后我们必须单独取出函数,这有点尴尬.我这样做只是为了说明类型类的不同方面:它们也只是美化的函数记录(我们在下面使用)以及一些魔术,其中编译器根据推断类型插入适当的记录(我们在上面使用),并在下面继续使用).让我们简化一下:

It's a bit awkward that we have a class only to get a record of functions, which we then have to take the functions out of separately. I only did it this way to illustrate separate aspects of type classes: they're also just glorified records of functions (which we use below) together with some magic where the compiler inserts the appropriate record based on the inferred type (which we use above, and continue using below). Let's simplify:

class IsWidget a where
    onMouseClick :: Int -> Int -> a -> IO ()
    onKeyPress   :: Key -> a -> IO ()
    paint        :: a -> IO ()
    ...

instance IsWidget MyWidget where
    onMouseClick x y myWidget = ... do stuff with (foo myWidget) and (bar myWidget) ...
    onKeyPress key myWidget = ...
    paint myWidget = ...

sendClick :: Int -> Int -> Widget -> IO ()
sendClick x y (Widget a) = onMouseClick x y a

-- the rest is unchanged from above

这种风格经常被来自面向对象语言的人采用,因为它更熟悉,更接近面向对象语言的一对一映射方式.但是对于大多数用途,它只是比第一部分中描述的方法更复杂,更不灵活.原因是,如果各种 Widget 唯一重要的事情是它们实现小部件功能的方式,那么创建类型、这些类型的接口实例,然后通过将它们放入其中来再次抽象出底层类型就没有什么意义了存在性包装器:直接传递函数更简单.

This style is often adopted by people coming from OO languages, because it's more familiar and closer to a one-for-one mapping from the way OO languages do it. But for most purposes it's just more elaborate and less flexible than the approach described in the first section. The reason is that if the only significant thing about the various Widgets is the way they implement the widget functions, then there's little point in making types, instances of the interface for those types, and then abstracting away the underlying type again by putting them in an existential wrapper: it's simpler to just pass the functions around directly.

可以想到的一个优点是,虽然 Haskell 没有子类型化,但它确实有子类化"(可能更好地称为子接口或子约束).例如你可以这样做:

One advantage I can think of is that while Haskell doesn't have subtyping, it does have "subclassing" (probably better referred to as sub-interfacing or sub-constraining). For example you could do:

class IsWidget a => IsWidgetExtra a where
    ...additional methods to implement...

然后对于您拥有 IsWidgetExtra 的任何类型,您还可以无缝地使用 IsWidget 的方法.基于记录的方法的唯一替代方法是使用记录中的记录,这涉及内部记录的一些手动包装和展开.但这只有在您想显式地模拟 OO 语言的深层类层次结构时才有好处,而这反过来又只有在您想让自己变得困难时才会这样做.(另请注意,Haskell 没有任何内置方法可以将 IsWidget 动态向下转换为 IsWidgetExtra.但是有 ifcxt)

and then with any type for which you have IsWidgetExtra, you can also use the methods of IsWidget seamlessly. The only alternative with the record-based approach is to have records-within-records, which involves some manual wrapping-and-unwrapping of the inner records. But this would only be advantageous if you wanted to explicitly emulate the deep class hierarchy of an OO language, which in turn you would only do if you wanted to make life difficult for yourself. (Note also that Haskell doesn't have any built-in way to dynamically down-cast from IsWidget to IsWidgetExtra. But there is ifcxt)

(基于记录的方法有什么优点?除了不需要每次做新的事情都定义一个新的类型,记录是简单的值级的东西,值比类型更容易操作. 例如,您可以编写一个函数,将 Widget 作为参数,并基于它创建一个新的 Widget,其中一些不同,其他保持不变.这有点像从 C++ 中的模板参数子类化,只是不那么令人困惑.)

(What about the advantages of the record-based approach? Besides not having to define a new type every time you want to do a new thing, records are simple value-level things, and values are much easier to manipulate than types. You could, for example, write a function that takes a Widget as an argument and makes a new Widget based on it, with some things different and others kept the same. This is sort of like subclassing from a template parameter in C++, just less confusing.)

  • 高阶函数:将其他函数作为参数(或将它们作为结果返回)的函数

  • Higher-order function: a function that takes other functions as arguments (or returns them as results)

记录:struct(具有公共数据成员的类,没有别的).也称为字典.

Record: struct (class with public data members and nothing else). Also known as a dictionary.

闭包:函数式语言(以及许多其他语言)允许您定义局部函数(函数内的函数、lambdas),这些函数在定义站点引用作用域内的事物(例如,外部函数的参数)您通常不希望被保留,而是在函数的关闭"中.或者,如果你有一个像 plus 这样的函数,它接受两个整数并返回一个整数,你可以将它应用于一个参数,比如 5,结果将是接受一个 int 并返回一个 int 的函数,通过向它添加 5 - 在这种情况下, 5 也存储在结果函数的闭包中.(在其他上下文中,闭包"有时也表示带有闭包的函数".)

Closure: Functional languages (and many others) allow you to define local functions (functions within functions, lambdas) that make reference to things in scope at the definition site (for example, the arguments of the outer function) that you would normally expect not to be kept around, but are, in the function's "closure". Alternately, if you have a function like plus that takes two ints and returns an int, you can apply it to just one argument, say 5, and the result will be a function that takes an int and returns an int, by adding 5 to it - in that case the 5 is also stored in the resulting function's closure. (In other contexts "closure" is also sometimes used to mean "a function with a closure".)

类型类:与面向对象语言中的类相同.有点像一个界面,但也非常不同.参见此处.

Type class: not the same as a class in an OO language. Sort of like an interface, but also very different. See here.

编辑 29-11-14:虽然我认为这个答案的核心本质上仍然是正确的(Haskell 中的 HOF 对应于 OOP 中的虚拟方法),但自从我编写它以来,我的价值判断已经变得有些细微差别.特别是,我现在认为 Haskell 和 OOP 的方法都不是严格意义上的更基本的".请参阅此reddit评论.

EDIT 29-11-14: While I think the kernel of this answer is still essentially correct (HOFs in Haskell correspond to virtual methods in OOP), my value judgements have grown some nuance since I wrote it. In particular, I now think that neither Haskell's nor OOP's approach is strictly "more fundamental" than the other. See this reddit comment.

这篇关于Haskell 中的动态调度的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆