了解 GenericTraversableTemplate 和其他 Scala 集合内部结构 [英] Understanding GenericTraversableTemplate and other Scala collection internals

查看:38
本文介绍了了解 GenericTraversableTemplate 和其他 Scala 集合内部结构的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在与一个非常喜欢 Kotlin、Clojure 和 Java8 的熟人交换电子邮件,并问他为什么不使用 Scala.他提供了很多理由(Scala 太学术,功能太多,我不是第一次听到这个,我认为这是非常主观的)但是他最大的痛点是举个例子,他不喜欢一门看不懂基本数据结构实现的语言,他举了LinkedList为例.

I was exchanging emails with an acquaintance that is a big Kotlin, Clojure and Java8 fan and asked him why not Scala. He provided many reasons (Scala is too academic, too many features, not the first time I hear this and I think this is very subjective) but his biggest pain point was as an example, that he doesn't like a language where he can't understand the implementation of basic data structures, and he gave LinkedList as an example.

我查看了 scala.collection.LinkedList 并计算了我理解或有点理解的东西.

I took a look at scala.collection.LinkedList and counted the things I either understand or somewhat understand.

  • CanBuildFrom - 经过一番努力,我明白了,输入类,而不是最长的遗书历史上[1]
  • LinkedListLike - 我不记得我在哪里读过它,但我确信这是有充分理由的

但后来我开始盯着这些

  • GenericTraversableTemplate - 现在我也在挠头......
  • SeqFactory、GenericCompanion - 好吧,现在你失去了我,我开始理解他的观点

理解这一点的人可以在LinkedList的上下文中解释GenericTraversableTemplate SeqFactoryGenericCompanion吗?它们的用途是什么,它们对最终用户有什么影响(例如,我确定它们存在的理由很充分,原因是什么?)

Can someone who understand this well please explain GenericTraversableTemplate SeqFactory and GenericCompanion in the context of LinkedList? What they are for, what impact on the end user they have (e.g. I'm sure they are there for a good reason, what is that reason?)

他们在那里是出于实际原因吗?还是可以简化的抽象级别?

Are they there for a practical reason? or is it a level of abstraction that could have been simplified?

我喜欢 Scala 集合,因为我不必了解内部结构就能有效地使用它们.我不介意复杂的实现,如果它有助于我保持使用更简单.例如如果我有能力使用它来编写更简洁、更优雅的代码,我不介意为一个复杂的库付出代价.但如果能更好地理解它肯定会很好.

I like Scala collections because I don't have to understand the internals to be able to effectively use them. I don't mind a complex implementation if it helps me to keep my usage simpler. e.g. I don't mind paying the price of a complex library if I get the ability to write cleaner more elegant code using it in return. but it will sure be nice to better understand it.

[1] - Scala 2.8 收藏库是历史上最长的遗书"吗?

推荐答案

我将尝试从一个随机行人的角度来描述这些概念(我从来没有向 Scala 集合库贡献过一行代码,所以不要如果我错了,不要打我太狠).

I will try to describe the concepts from the point of view of a random pedestrian (I've never contributed a single line to the Scala collection library, so don't hit me too hard if I'm wrong).

由于 LinkedList 现在已弃用,并且因为 Maps 提供了更好的示例,所以我将使用 TreeMap 作为示例.

Since LinkedList is now deprecated, and because Maps provide a better example, I will use TreeMap as example.

CanBuildFrom

动机是这样的:如果我们取一个 TreeMap[Int, Int] 并用

The motivation is this: If we take a TreeMap[Int, Int] and map it with

case (x, y) => (2 * x, y * y * 0.3d)

我们得到 TreeMap[Int, Double].仅这种类型的安全性就已经解释了简单的 genericBuilder[X] 结构.但是,如果我们将其映射为

we get TreeMap[Int, Double]. This type safety alone would already explain the necessity for simple genericBuilder[X] constructs. However, if we map it with

case (x, y) => x

我们得到一个Iterable[Int](更准确的说:一个List[Int]),这不再是一个Map,容器的类型发生了变化.这就是 CBF 发挥作用的地方:

we obtain an Iterable[Int] (more precisely: a List[Int]), this is no longer a Map, the type of the container has changed. This is where CBF's come into play:

CanBuildFrom[This, X, That]

可以看作是一种类型级函数",它告诉我们:如果我们映射一个类型的集合这带有一个返回 X 类型值的函数,我们可以构建一个 That.最具体的CBF 是在编译时提供的,在第一种情况下,它将类似于

can be seen as a kind of "type-level function" that tells us: if we map a collection of type This with a function that returns values of type X, we can build a That. The most specific CBF is provided at compile time, in the first case it will be something like

CanBuildFrom[TreeMap[_,_], (X,Y), TreeMap[X,Y]]

在第二种情况下,它将类似于

in the second case it will be something like

CanBuildFrom[TreeMap[_,_], X, Iterable[X]]

所以我们总是得到正确类型的容器.图案很一般.每次你有一个泛型函数

and so we always get the right type of the container. The pattern is pretty general. Every time you have a generic function

foo[X1, ..., Xn](x1: X1, ..., xn: Xn): Y 

其中结果 type Y 取决于 X1, ..., Xn,您可以引入一个隐式参数作为如下:

where the result type Y depends on X1, ..., Xn, you can introduce an implicit parameter as follows:

foo[X1, ...., Xn, Y](x1: X1, ..., xn: Xn)(implicit CanFooFrom[X1, ..., Xn, Y]): Y

然后通过提供多个分段定义类型级函数 X1, ..., Xn -> Y隐式 CanFooFrom 的.

and then define the type-level function X1, ..., Xn -> Y piecewise by providing multiple implicit CanFooFrom's.

LinkedListLike

在类定义中,我们看到如下内容:

In the class definition, we see something like this:

TreeMap[A, B] extends SortedMap[A, B] with SortedMapLike[A, B, TreeMap[A, B]]

这是 Scala 表达所谓的 F 有界多态性的方式.动机如下:假设我们有十几个(或至少两个……)特征 SortedMap[A, B] 的实现.现在我们要实现一个方法withoutHead,它看起来有点像这样:

This is Scala's way to express the so-called F-bounded polymorphism. The motivation is as follows: Suppose we have a dozen (or at least two...) implementations of the trait SortedMap[A, B]. Now we want to implement a method withoutHead, it could look somewhat like this:

def withoutHead = this.remove(this.head)

如果我们将实现移到 SortedMap[A, B] 本身,我们能做的最好的事情就是:

If we move the implementation into SortedMap[A, B] itself, the best we can do is this:

def withoutHead: SortedMap[A, B] = this.remove(this.head)

但这是我们能得到的最具体的结果类型吗?不,这太模糊了.如果原始地图是 TreeMap,我们希望返回 TreeMap[A, B],并且CrazySortedLinkedHashMap(或其他……)如果原始是 CrazySortedLinkedHashMap.这就是为什么我们将实现移到 SortedMapLike 中,并为 withoutHead 方法提供以下签名:

But is this the most specific result type we can get? No, that's too vague. We would like to return TreeMap[A, B] if the original map is a TreeMap, and CrazySortedLinkedHashMap (or whatever...) if the original was a CrazySortedLinkedHashMap. This is why we move the implementation into SortedMapLike, and give the following signature to the withoutHead method:

trait SortedMapLike[A, B, Repr <: SortedMap[A, B]] {
  ...
  def withoutHead: Repr = this.remove(this.head)
}

现在因为 TreeMap[A, B] 扩展了 SortedMapLike[A, B, TreeMap[A, B]],结果类型为withoutHeadTreeMap[A,B].CrazySortedLinkedHashMap 也是如此:我们得到了确切的类型.在 Java 中,您必须返回 SortedMap[A, B] 或覆盖每个子类中的方法(这对于 Scala 中功能丰富的 trait 来说是一个维护噩梦)

now because TreeMap[A, B] extends SortedMapLike[A, B, TreeMap[A, B]], the result type of withoutHead is TreeMap[A,B]. The same holds for CrazySortedLinkedHashMap: we get the exact type back. In Java, you would either have to return SortedMap[A, B] or override the method in each subclass (which turned out to be a maintenance nightmare for the feature-rich traits in Scala)

GenericTraversableTemplate

类型为:GenericTraversableTemplate[+A, +CC[X] <: GenTraversable[X]]

据我所知,这只是一个提供实现的特性以某种方式返回具有相同容器类型的常规集合的方法,但可能不同的内容类型(例如flattentransposeunzip).

As far as i can tell, this is just a trait that provides implementations of methods that somehow return regular collections with same container type but possibly different content type (stuff like flatten, transpose, unzip).

foldLeftreduceexists 之类的东西不在这里,因为这些方法只关心内容类型,而不关心容器类型.

Stuff like foldLeft, reduce, exists are not here because these methods care only about content type, not container type.

flatMap 之类的东西不在这里,因为容器类型可以改变(同样是 CBF 的).

Stuff like flatMap is not here, because the container type can change (again, CBF's).

为什么它是一个单独的特征,它存在的根本原因是什么?我不这么认为......可能有可能以不同的方式对成千上万的方法进行分组.但这只是自然发生的事情:你开始实现一个 trait,结果发现它有很多方法.因此,相反,您将松散相关的方法分组,并将它们放入 10 个不同的特征中,名称笨拙,例如GenTraversableTemplate",然后将它们全部混合到您需要的特征/类中...

Why is it a separate trait, is there a fundamental reason why it exists? I don't think so... It probably would be possible to group the godzillion of methods somewhat differently. But this is just what happens naturally: you start to implement a trait, and it turns out that it has very many methods. So instead you group loosely related methods, and put them into 10 different traits with awkward names like "GenTraversableTemplate", and them mix them all into traits/classes where you need them...

通用伴侣

这只是一个抽象类,实现了一些常见的基本功能对于大多数集合类的伴随对象(本质上,它只是实现了非常简单的工厂方法 apply(varargs)empty).

This is just an abstract class that implements some basic functionality which is common for companion objects of most collection classes (essentially, it just implements very simple factory methods apply(varargs) and empty).

例如,有一个方法 apply 接受某种类型 A 的 varargs 并返回一个类型为 CC[A] 的集合:

For example there is method apply that takes varargs of some type A and returns a collection of type CC[A]:

Array(1, 2, 3, 4) // calls Array.apply[A](elems: A*) on the companion object
List(1, 2, 3, 4) // same for List

实现很简单,就是这样的:

The implementation is very simple, it's something like this:

def apply[A](varargs: A*): CC[A] = {
  val builder = newBuilder[A]
  for (arg <- varargs) builder += arg
  builder.result()
}

对于数组、列表和树映射以及几乎所有其他内容,这显然是相同的,除了像 Bitset 这样的受约束的不规则集合".所以这只是大多数伴随对象的共同祖先类中的常见功能.没什么特别的.

This is obviously the same for Arrays and Lists and TreeMaps and almost everything else, except 'constrained irregular Collections' like Bitset. So this is just common functionality in a common ancestor class of most companion objects. Nothing special about that.

SeqFactory

类似于 GenericCompanion,但这次更专门用于序列.添加了一些常见的工厂方法,如 fill()iterate()tabulate() 等.同样,这里没有什么特别的火箭科学......

Similar to GenericCompanion, but this time more specifically for Sequences. Adds some common factory methods like fill() and iterate() and tabulate() etc. Again, nothing particularly rocket-scientific here...

一些一般性评论

总的来说:我不认为人们应该试图理解这个库中的每一个特征.相反,人们应该尝试将图书馆视为一个整体.总的来说,它有一个非常有趣的架构.而且在我个人看来,它实际上是一款非常美观的软件,但必须盯着它看很长时间(并尝试多次重新实现整个架构模式)才能掌握它.另一方面:例如,CBF 是一种设计模式",显然应该在这种语言的后继者中消除.整个故事以及隐含 CBF 的范围对我来说仍然像是一场噩梦.但是很多事情一开始似乎完全不可理解,而且几乎总是以顿悟结束(这对于 Scala 来说非常特殊:对于大多数其他语言,这种斗争通常以作者是个彻头彻尾的白痴"的想法结束).

In general: I don't think that one should attempt to understand every single trait in this library. Rather, one should try to look at the library as a whole. As a whole, it has a very interesting architecture. And in my personal opinion, it's actually a very aesthetic piece of software, but one has to stare at it for quite a while (and try to re-implement the whole architectural pattern several times) to grasp it. On the other hand: for example CBF's are kind of "design pattern" that clearly should be eliminated in successors of this language. The whole story with the scope of implicit CBF's still seems like a total nightmare to me. But many things seemed completely inscrutable at first, and almost always, it ended with an epiphany (which is very specific for Scala: for the majority of other languages, such struggles usually end with the thought "Author of this is a complete idiot").

这篇关于了解 GenericTraversableTemplate 和其他 Scala 集合内部结构的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆