How are Scala collections able to return the correct collection type from a map operation?


Problem description


Note: This is an FAQ, asked specifically so that I can answer it myself, as this issue seems to come up fairly often and I want to put it in a location where it can (hopefully) be easily found via a search.

As prompted by a comment on my answer here


For example:

"abcde" map {_.toUpperCase} //returns a String
"abcde" map {_.toInt} // returns an IndexedSeq[Int]
BitSet(1,2,3,4) map {2*} // returns a BitSet
BitSet(1,2,3,4) map {_.toString} // returns a Set[String]

Looking in the scaladoc, all of these use the map operation inherited from TraversableLike, so how come it's always able to return the most specific valid collection? Even String, which provides map via an implicit conversion.

Solution

Scala collections are clever things...

The internals of the collection library are one of the more advanced topics in the land of Scala. They involve higher-kinded types, inference, variance, implicits, and the CanBuildFrom mechanism - all to make the library incredibly generic, easy to use, and powerful from a user-facing perspective. Understanding it from the point of view of an API designer is not a task to be taken on lightly by a beginner.

On the other hand, it's incredibly rare that you'll ever actually need to work with collections at this depth.

So let us begin...

With the release of Scala 2.8, the collection library was completely rewritten to remove duplication. A great many methods were moved to just one place so that ongoing maintenance and the addition of new collection methods would be far easier, but this also makes the hierarchy harder to understand.

Take List, for example. It inherits from (in turn):

  • LinearSeqOptimized
  • GenericTraversableTemplate
  • LinearSeq
  • Seq
  • SeqLike
  • Iterable
  • IterableLike
  • Traversable
  • TraversableLike
  • TraversableOnce

That's quite a handful! So why this deep hierarchy? Ignoring the XxxLike traits briefly, each tier in that hierarchy adds a little bit of functionality, or provides a more optimised version of inherited functionality (for example, fetching an element by index on a Traversable requires a combination of drop and head operations, grossly inefficient on an indexed sequence). Where possible, all functionality is pushed as far up the hierarchy as it can possibly go, maximising the number of subclasses that can use it and removing duplication.

map is just one such example. The method is implemented in TraversableLike (though the XxxLike traits only really exist for library designers, so for most intents and purposes it's considered to be a method on Traversable - I'll come to that part shortly) and is widely inherited. It's possible to define an optimised version in some subclass, but it must still conform to the same signature. Consider the following uses of map (as also mentioned in the question):

"abcde" map {_.toUpperCase} //returns a String
"abcde" map {_.toInt} // returns an IndexedSeq[Int]
BitSet(1,2,3,4) map {2*} // returns a BitSet
BitSet(1,2,3,4) map {_.toString} // returns a Set[String]

In each case, the output is of the same type as the input wherever possible. When it's not possible, superclasses of the input type are checked until one is found that does offer a valid return type. Getting this right took a lot of work, especially when you consider that String isn't even a collection, it's just implicitly convertible to one.
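A quick way to see these result types is to pin each one with a type ascription, which fails to compile if map returns anything more general. This is only a small check, assuming a standard Scala 2 installation; the object name MapResultTypes is made up for illustration, and `_.toUpper` is used in place of `_.toUpperCase` because that is the name I am sure exists on Char.

```scala
import scala.collection.immutable.BitSet

// Hypothetical holder object; each ascription is verified by the compiler
// against the static type that map was inferred to return.
object MapResultTypes {
  val upper: String          = "abcde".map(_.toUpper)
  val codes: IndexedSeq[Int] = "abcde".map(_.toInt)
  val doubled: BitSet        = BitSet(1, 2, 3, 4).map(_ * 2)
  val shown: Set[String]     = BitSet(1, 2, 3, 4).map(_.toString)
}
```

Changing the ascription on `codes` to String, or on `shown` to BitSet, should be rejected at compile time, which is the behaviour described above.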

So how is it done?

One half of the puzzle is the XxxLike traits (I did say I'd get to them...), whose main function is to take a Repr type param (short for "Representation") so that they'll know the true subclass actually being operated on. So e.g. TraversableLike is the same as Traversable, but abstracted over the Repr type param. This param is then used by the second half of the puzzle: the CanBuildFrom type class, which captures the source collection type, target element type and target collection type to be used by collection-transforming operations.

It's easier to explain with an example!

BitSet defines an implicit instance of CanBuildFrom like this:

implicit def canBuildFrom: CanBuildFrom[BitSet, Int, BitSet] = bitsetCanBuildFrom

When compiling BitSet(1,2,3,4) map {2*}, the compiler will attempt an implicit lookup of CanBuildFrom[BitSet, Int, T].

This is the clever part... There's only one implicit in scope that matches the first two type parameters. The first parameter is Repr, as captured by the XxxLike trait, and the second is the target element type, i.e. the result type of the function passed to map. The map operation is also parameterised with a type T, which is inferred from the third type parameter of the CanBuildFrom instance that was implicitly located: BitSet in this case.

So the first two type parameters to CanBuildFrom are inputs, to be used for implicit lookup, and the third parameter is an output, to be used for inference.

The CanBuildFrom in BitSet therefore matches the two types BitSet and Int, so the lookup will succeed, and the inferred return type will also be BitSet.

When compiling BitSet(1,2,3,4) map {_.toString}, the compiler will attempt an implicit lookup of CanBuildFrom[BitSet, String, T]. This will fail for the implicit in BitSet, so the compiler will next try its superclass, Set, which contains the implicit:

implicit def canBuildFrom[A]: CanBuildFrom[Coll, A, Set[A]] = setCanBuildFrom[A]

This matches because Coll is a type alias for Set[_] and CanBuildFrom is contravariant in its first type parameter, so an instance declared for any Set is accepted where one for BitSet is required. The A will match anything, as canBuildFrom is parameterised with the type A; in this case it's inferred to be String, thus yielding a return type of Set[String].
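The two lookups can also be requested by hand with implicitly. The sketch below assumes Scala 2.8 to 2.12 only, since CanBuildFrom is no longer part of the library from 2.13 onwards; the object and value names are made up for illustration. On later 2.x releases the String instance may come from a more specific sorted-set companion, which still conforms to Set[String].

```scala
import scala.collection.generic.CanBuildFrom
import scala.collection.immutable.BitSet

// Assumes Scala 2.8 to 2.12: CanBuildFrom does not exist in 2.13 or later.
object BitSetLookup {
  // Inputs (BitSet, Int): the instance from the BitSet companion applies.
  val intCbf: CanBuildFrom[BitSet, Int, BitSet] =
    implicitly[CanBuildFrom[BitSet, Int, BitSet]]

  // Inputs (BitSet, String): only an instance from a superclass companion applies.
  val strCbf: CanBuildFrom[BitSet, String, Set[String]] =
    implicitly[CanBuildFrom[BitSet, String, Set[String]]]

  // An instance is a factory for builders of the target collection type.
  val ints: BitSet = {
    val builder = intCbf()
    builder += 3
    builder += 1
    builder.result()
  }

  val strs: Set[String] = {
    val builder = strCbf()
    builder += "b"
    builder += "a"
    builder.result()
  }
}
```

Asking for implicitly[CanBuildFrom[BitSet, String, BitSet]] instead should fail to compile, because no instance can build a BitSet out of Strings.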

So to correctly implement a collection type, you not only need to provide a correct implicit of type CanBuildFrom, but you also need to ensure that the concrete type of that collection is supplied as the Repr param to the correct parent traits (for example, this would be MapLike in the case of subclassing Map).
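Both requirements can be seen in a toy model that does not touch the real library at all. Everything in this sketch (CanBuild, BagLike, Bag, IntBag) is a hypothetical stand-in chosen for illustration: CanBuild plays the role of CanBuildFrom, BagLike the role of an XxxLike trait, and IntBag the role of BitSet.

```scala
object ToyCollections {
  // Stand-in for CanBuildFrom: From and Elem are lookup inputs, To is the output.
  trait CanBuild[-From, -Elem, +To] {
    def build(elems: List[Elem]): To
  }

  // Stand-in for an XxxLike trait: Repr is the concrete collection type.
  trait BagLike[+A, +Repr] {
    def elems: List[A]
    // Repr and B select the instance; That is whatever that instance builds.
    def map[B, That](f: A => B)(implicit cb: CanBuild[Repr, B, That]): That =
      cb.build(elems.map(f))
  }

  // General collection: its companion can build a Bag of any element type.
  class Bag[A](val elems: List[A]) extends BagLike[A, Bag[A]]
  object Bag {
    implicit def canBuild[B]: CanBuild[Bag[_], B, Bag[B]] =
      new CanBuild[Bag[_], B, Bag[B]] {
        def build(elems: List[B]): Bag[B] = new Bag(elems)
      }
  }

  // Specialised collection: passes itself as Repr and can only be built from Ints.
  class IntBag(ints: List[Int]) extends Bag[Int](ints) with BagLike[Int, IntBag]
  object IntBag {
    implicit val canBuild: CanBuild[IntBag, Int, IntBag] =
      new CanBuild[IntBag, Int, IntBag] {
        def build(elems: List[Int]): IntBag = new IntBag(elems)
      }
  }

  val source = new IntBag(List(1, 2, 3))
  val doubled: IntBag = source.map(_ * 2)       // instance from the IntBag companion
  val shown: Bag[String] = source.map(_.toString) // falls back to the Bag companion
}
```

If IntBag left out the extra BagLike[Int, IntBag] parent, Repr would be Bag[Int] and mapping Int to Int could no longer yield an IntBag, which is the mistake the paragraph above warns about.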

String is a little more complicated as it provides map by an implicit conversion. The implicit conversion is to StringOps, which subclasses StringLike[String], which ultimately derives from TraversableLike[Char,String] - String being the Repr type param.

There's also a CanBuildFrom[String,Char,String] in scope so that the compiler knows that when mapping the elements of a String to Chars, the return type should also be a String. From this point onwards, the same mechanism is used.
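The String case can be imitated with the same kind of toy model. All names below (CanBuild, ToyLike, TextOps, textOps, mapElems) are hypothetical and are not the library's own; the method is called mapElems so that it does not collide with the map that String already receives from the standard library. The lower-priority fallback instance is my assumption about how the Char => Int case ends up as an indexed sequence.

```scala
import scala.language.implicitConversions

object ToyStringOps {
  trait CanBuild[-From, -Elem, +To] {
    def build(elems: List[Elem]): To
  }

  // Fallback: any other element type is built into a Vector.
  trait LowPriorityCanBuild {
    implicit def fallback[B]: CanBuild[String, B, Vector[B]] =
      new CanBuild[String, B, Vector[B]] {
        def build(elems: List[B]): Vector[B] = elems.toVector
      }
  }

  object CanBuild extends LowPriorityCanBuild {
    // Preferred when the target elements are Chars: build a String again.
    implicit val chars: CanBuild[String, Char, String] =
      new CanBuild[String, Char, String] {
        def build(elems: List[Char]): String = elems.mkString
      }
  }

  trait ToyLike[+A, +Repr] {
    def elems: List[A]
    def mapElems[B, That](f: A => B)(implicit cb: CanBuild[Repr, B, That]): That =
      cb.build(elems.map(f))
  }

  // Stand-in for StringOps: the wrapper is not a String, but Repr is String.
  final class TextOps(s: String) extends ToyLike[Char, String] {
    def elems: List[Char] = s.toList
  }
  implicit def textOps(s: String): TextOps = new TextOps(s)

  val upper: String = "abc".mapElems(_.toUpper)    // Char => Char gives a String
  val codes: Vector[Int] = "abc".mapElems(_.toInt) // Char => Int uses the fallback
}
```

Because the wrapper fixes Repr to String rather than to TextOps, the result of mapping is a plain String and the wrapper never leaks into user code.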
