什么是多态lambda? [英] What is a polymorphic lambda?

查看:106
本文介绍了什么是多态lambda?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

lambdas(匿名函数)的概念对我来说很清楚.而且我知道在类方面具有多态性,运行时/动态分派用于根据实例的最大派生类型来调用适当的方法.但是,lambda究竟如何是多态的呢?我还是另一个Java程序员,试图学习有关函数式编程的更多信息.

The concept of lambdas (anonymous functions) is very clear to me. And I'm aware of polymorphism in terms of classes, with runtime/dynamic dispatch used to call the appropriate method based on the instance's most derived type. But how exactly can a lambda be polymorphic? I'm yet another Java programmer trying to learn more about functional programming.

推荐答案

在以下答案中,您会发现我对lambda的讨论不多.请记住,在函数式语言中,任何函数都是绑定到名称的lambda,所以我所说的函数会转换为lambda.

请注意,多态实际上并不需要OO语言通过派生类覆盖虚拟方法来实现的调度".那只是一种特殊的多态性, subtyping .

Note that polymorphism doesn't really require the kind of "dispatch" that OO languages implement through derived classes overriding virtual methods. That's just one particular kind of polymorphism, subtyping.

多态性本身只是意味着一个函数不仅允许一种特定类型的参数,而且还可以针对任何一种允许的类型采取相应的行动.最简单的例子:您根本不关心类型,而只是处理传入的内容.或者,要使其变得不那么琐碎,请将其包装在单个元素的容器中.您可以在C ++中实现这样的功能:

Polymorphism itself simply means a function allows not just for one particular type of argument, but is able to act accordingly for any of the allowed types. The simplest example: you don't care for the type at all, but simply hand on whatever is passed in. Or, to make it not quite so trivial, wrap it in a single-element container. You could implement such a function in, say, C++:

template<typename T> std::vector<T> wrap1elem( T val ) {
  return std::vector(val);
}

但是您不能将其作为lambda实现,因为C ++(撰写时间:C ++ 11 )不支持多态lambda.

but you couldn't implement it as a lambda, because C++ (time of writing: C++11) doesn't support polymorphic lambdas.

...至少不是这样. C ++模板以一种非常不寻常的方式实现了多态性:实际上,编译器会在遇到的所有代码中,为任何传递给该函数的每种类型生成一个单态函数.由于C ++的值语义,这是必需的:传入值时,编译器需要知道确切的类型(其在内存中的大小,可能的子节点等),以便创建一个它的副本.

...At least not in this way, that is. C++ templates implement polymorphism in rather an unusual way: the compiler actually generates a monomorphic function for every type that anybody passes to the function, in all the code it encounters. This is necessary because of C++' value semantics: when a value is passed in, the compiler needs to know the exact type (its size in memory, possible child-nodes etc.) in order to make a copy of it.

在大多数较新的语言中,几乎所有内容都只是对某个值的引用,当您调用函数时,它不会获得参数对象的副本,而只是获得对已经存在的-现有的.较早的语言要求您将参数明确标记为引用/指针类型.

In most newer languages, almost everything is just a reference to some value, and when you call a function it doesn't get a copy of the argument objects but just a reference to the already-existing ones. Older languages require you to explicitly mark arguments as reference / pointer types.

引用语义的一大优点是,多态变得容易得多:指针始终具有相同的大小,因此同一机器代码完全可以处理对任何类型的引用.十分难看地 1 ,即使在C语言中,也可以进行多态容器包装:

A big advantage of reference semantics is that polymorphism becomes much easier: pointers always have the same size, so the same machine code can deal with references to any type at all. That makes, very uglily1, a polymorphic container-wrapper possible even in C:

typedef struct{
  void** contents;
  int size;
} vector;

vector wrap1elem_by_voidptr(void* ptr) {
  vector v;
  v.contents = malloc(sizeof(&ptr));
  v.contents[0] = ptr;
  v.size = 1;
  return v;
}
#define wrap1elem(val) wrap1elem_by_voidptr(&(val))

在这里,void*只是指向任何未知类型的指针.因此出现了明显的问题:vector不知道它包含"什么类型的元素!因此,您实际上无法对这些对象做任何有用的事情. 除非您确实知道它是什么类型

Here, void* is just a pointer to any unknown type. The obvious problem thus arising: vector doesn't know what type(s) of elements it "contains"! So you can't really do anything useful with those objects. Except if you do know what type it is!

int sum_contents_int(vector v) {
  int acc = 0, i;
  for(i=0; i<v.size; ++i) {
    acc += * (int*) (v.contents[i]);
  }
  return acc;
}

显然,这非常费力.如果类型是double怎么办?如果我们想要 product 而不是总和怎么办?当然,我们可以手工编写每个案例.这不是一个很好的解决方案.

obviously, this is extremely laborious. What if the type is double? What if we want the product, not the sum? Of course, we could write each case by hand. Not a nice solution.

如果我们有一个通用的函数来接受该指令做什么作为额外的参数,那将是更好的选择! C具有功能指针:

What would we better is if we had a generic function that takes the instruction what to do as an extra argument! C has function pointers:

int accum_contents_int(vector v, void* (*combine)(int*, int)) {
  int acc = 0, i;
  for(i=0; i<v.size; ++i) {
    combine(&acc, * (int*) (v.contents[i]));
  }
  return acc;
}

然后可以像

void multon(int* acc, int x) {
  acc *= x;
}
int main() {
  int a = 3, b = 5;
  vector v = wrap2elems(a, b);
  printf("%i\n", accum_contents_int(v, multon));
}

除了仍然很麻烦之外,上述所有C代码都存在一个巨大的问题:对于容器元素实际上是否具有正确的类型,它是完全不受检查的!来自*void的强制类型转换可以愉快地在任何类型上触发,但无疑会导致完全垃圾 2 .

Apart from still being cumbersome, all the above C code has one huge problem: it's completely unchecked if the container elements actually have the right type! The casts from *void will happily fire on any type, but in doubt the result will be complete garbage2.

该问题是OO语言试图通过将对象中可能正确执行的所有操作与对象作为方法捆绑在一起而解决的主要问题之一.在编译类时,类型是单态的,因此编译器可以检查操作是否有意义.当您尝试使用这些值时,如果编译器知道如何找到方法,就足够了.特别是,如果您创建派生类,则编译器会知道啊哈,即使在派生对象上,也可以从基类中调用该方法".

That problem is one of the main issues which OO languages solve by trying to bundle all operations you might perform right together with the data, in the object, as methods. While compiling your class, the types are monomorphic so the compiler can check the operations make sense. When you try to use the values, it's enough if the compiler knows how to find the method. In particular, if you make a derived class, the compiler knows "aha, it's ok to call that method from the base class even on a derived object".

不幸的是,这意味着您通过多态性所获得的一切都等同于合成数据并仅在单个字段上调用(单态)方法.为了针对不同类型实际获得不同的行为(但受控!),OO语言需要虚拟方法.基本上,这意味着该类具有带有指向方法实现的指针的额外字段,非常类似于我在C示例中使用的指向combine函数的指针–区别在于您只能通过添加一个派生类来实现重写方法,编译器将再次为该类知道所有数据字段的类型等,因此您很安全.

Unfortunately, that would mean all you achieve by polymorphism is equivalent to compositing data and simply calling the (monomorphic) methods on a single field. To actually get different behaviour (but controlledly!) for different types, OO languages need virtual methods. What this amounts to is basically that the class has extra fields with pointers to the method implementations, much like the pointer to the combine function I used in the C example – with the difference that you can only implement an overriding method by adding a derived class, for which the compiler again knows the type of all the data fields etc. and you're safe and all.

虽然基于继承的多态性显然可以工作,但我不禁说这是只是疯狂的愚蠢 3 确实有一定的局限性.如果只想使用一个碰巧没有实现为类方法的特定操作,则需要制作一个完整的派生类.即使您只想以某种方式 操作,您也需要派生并覆盖该方法的稍有不同的版本.

While inheritance-based polymorphism obviously works, I can't help saying it's just crazy stupid3 sure a bit limiting. If you want to use just one particular operation that happens to be not implemented as a class method, you need to make an entire derived class. Even if you just want to vary an operation in some way, you need to derive and override a slightly different version of the method.

让我们重新看看我们的C代码.从表面上看,我们注意到它应该完全有可能成为类型安全的,而没有任何方法捆绑的废话.我们只需要确保没有类型信息丢失–至少在编译期间不会丢失.想象(将∀T读为对于所有类型T")

Let's revisit our C code. On the face of it, we notice it should be perfectly possible to make it type-safe, without any method-bundling nonsense. We just need to make sure no type information is lost – not during compile-time, at least. Imagine (Read ∀T as "for all types T")

∀T: {
  typedef struct{
    T* contents;
    int size;
  } vector<T>;
}

∀T: {
  vector<T> wrap1elem(T* elem) {
    vector v;
    v.contents = malloc(sizeof(T*));
    v.contents[0] = &elem;
    v.size = 1;
    return v;
  }
}

∀T: {
  void accum_contents(vector<T> v, void* (*combine)(T*, const T*), T* acc) {
    int i;
    for(i=0; i<v.size; ++i) {
      combine(&acc, (*T) (v[i]));
    }
  }
}

即使签名看起来很像这篇文章顶部的C ++模板(我已经说过,实际上只是自动生成的单态代码),实际上,> implementation 只是普通的C.那里没有T值,只是指向它们的指针.无需编译代码的多个版本:在 runtime 处,不需要类型信息,我们只需处理通用指针即可. 在编译时,我们确实知道类型,并且可以使用函数头来确保它们匹配.也就是说,如果您写过

Observe how, even though the signatures look a lot like the C++ template thing on top of this post (which, as I said, really is just auto-generated monomorphic code), the implementation actually is pretty much just plain C. There are no T values in there, just pointers to them. No need to compile multiple versions of the code: at runtime, the type information isn't needed, we just handle generic pointers. At compile time, we do know the types and can use the function head to make sure they match. I.e., if you wrote

void evil_sumon (int* acc, double* x) { acc += *x; }

并试图做

vector<float> v; char acc;
accum_contents(v, evil_sumon, acc);

编译器会抱怨,因为类型不匹配:在accum_contents的声明中,它指出类型可能有所不同,,但是所有出现的T都需要解析为相同的类型.

the compiler would complain because the types don't match: in the declaration of accum_contents it says the type may vary, but all occurences of T do need to resolve to the same type.

这正是ML系列以及Haskell语言中的参数多态性的工作方式:函数实际上对它们所处理的多态数据一无所知.但是他们被赋予了专门的操作员,他们具有这样的知识,作为参数.

And that is exactly how parametric polymorphism works in languages of the ML family as well as Haskell: the functions really don't know anything about the polymorphic data they're dealing with. But they are given the specialised operators which have this knowledge, as arguments.

在像Java(在lambdas之前)这样的语言中,参数多态性不会给您带来什么好处:由于编译器故意定义仅是一个简单的辅助函数",而仅具有类方法,因此很难定义它,因此您可以简单地立即采取从类派生"的方式.但是在函数式语言中,定义小的辅助函数是最容易想到的事情:lambdas!

In a language like Java (prior to lambdas), parametric polymorphism doesn't gain you much: since the compiler makes it deliberately hard to define "just a simple helper function" in favour of having only class methods, you can simply go the derive-from-class way right away. But in functional languages, defining small helper functions is the easiest thing imaginable: lambdas!

因此,您可以在Haskell中执行令人难以置信的简洁代码:

And so you can do incredible terse code in Haskell:

Prelude>文件夹(+)0 [1,4,6]
11
Prelude>文件夹(\ x y-> x + y + 1)0 [1,4,6]
14
前奏> let f start = foldr(\ _(xl,xr)->(xr,xl))开始
前奏曲>:t f
f ::(t,t)-> [a]->(t,t)
前奏> f(左",右")[1]
(右",左")
前奏> f(左",右")[1、2]
(左",右")

Prelude> foldr (+) 0 [1,4,6]
11
Prelude> foldr (\x y -> x+y+1) 0 [1,4,6]
14
Prelude> let f start = foldr (\_ (xl,xr) -> (xr, xl)) start
Prelude> :t f
f :: (t, t) -> [a] -> (t, t)
Prelude> f ("left", "right") [1]
("right","left")
Prelude> f ("left", "right") [1, 2]
("left","right")

请注意,在我定义为f的辅助函数的lambda中,我对xlxr的类型一无所知,我只想交换这些元素的元组,而这需要类型为相同.这样将是一个多态lambda,其类型为

Note how in the lambda I defined as a helper for f, I didn't have any clue about the type of xl and xr, I merely wanted to swap a tuple of these elements which requires the types to be the same. So that would be a polymorphic lambda, with the type

\_ (xl, xr) -> (xr, xl)   ::  ∀ a t.  a -> (t,t) -> (t,t)


1 除了奇怪的显式malloc内容,类型安全性等之外:像这样的代码在没有垃圾收集器的语言中很难使用,因为有人总是需要清理一次内存不再需要它了,但是如果您没有适当地注意是否有人仍然持有该数据的引用,并且实际上可能仍然需要它.在Java,Lisp,Haskell中,您无需担心...


1Apart from the weird explicit malloc stuff, type safety etc.: code like that is extremely hard to work with in languages without garbage collector, because somebody always needs to clean up memory once it's not needed anymore, but if you didn't watch out properly whether somebody still holds a reference to the data and might in fact need it still. That's nothing you have to worry about in Java, Lisp, Haskell...

2 有一种完全不同的方法:选择一种动态语言.在这些语言中,每项操作都需要确保它适用于任何类型(或者,如果不可能的话,会引发明确定义的错误).然后,您可以任意地编写多态操作,一方面,它非常麻烦"(虽然不像像Haskell这样的真正聪明的类型系统那样麻烦),但是OTOH会产生相当大的开销,因为即使原始操作也需要类型决定和周围的保障措施.

2There is a completely different approach to this: the one dynamic languages choose. In those languages, every operation needs to make sure it works with any type (or, if that's not possible, raise a well-defined error). Then you can arbitrarily compose polymorphic operations, which is on one hand "nicely trouble-free" (not as trouble-free as with a really clever type system like Haskell's, though) but OTOH incurs quite a heavy overhead, since even primitive operations need type-decisions and safeguards around them.

3 我在这里当然不公平. OO范式不仅具有类型安全的多态性,还具有许多其他功能,例如:带有Hindler-Milner类型系统的旧ML无法做到(即席多态性:Haskell具有该类型的类,SML具有模块),甚至有些在Haskell中非常困难的事情(主要是在其中存储不同类型的值)可变大小的容器).但是,您对函数式编程的习惯越多,对这些东西的需求就会越少.

3I'm of course being unfair here. The OO paradigm has more to it than just type-safe polymorphism, it enables many things e.g. old ML with it's Hindler-Milner type system couldn't do (ad-hoc polymorphism: Haskell has type classes for that, SML has modules), and even some things that are pretty hard in Haskell (mainly, storing values of different types in a variable-size container). But the more you get accustomed to functional programming, the less need you will feel for such stuff.

这篇关于什么是多态lambda?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆