Should Scala's map() behave differently when mapping to the same type?

Question

In the Scala Collections framework, I think there are some behaviors that are counterintuitive when using map().

We can distinguish two kinds of transformations on (immutable) collections: those whose implementation calls newBuilder to recreate the resulting collection, and those which go through an implicit CanBuildFrom to obtain the builder.

The first category contains all transformations where the type of the contained elements does not change. They are, for example, filter, partition, drop, take, span, etc. These transformations are free to call newBuilder and to recreate the same collection type as the one they are called on, no matter how specific: filtering a List[Int] can always return a List[Int]; filtering a BitSet (or the RNA example structure described in this article on the architecture of the collections framework) can always return another BitSet (or RNA). Let's call them the filtering transformations.

The second category of transformations needs CanBuildFroms to be more flexible, as the type of the contained elements may change, and as a result the type of the collection itself may not be reusable: a BitSet cannot contain Strings; an RNA contains only Bases. Examples of such transformations are map, flatMap, collect, scanLeft, ++, etc. Let's call them the mapping transformations.

Now here's the main issue to discuss. No matter what the static type of the collection is, all filtering transformations will return the same collection type, while the collection type returned by a mapping operation can vary depending on the static type.

scala> import collection.immutable.TreeSet
import collection.immutable.TreeSet

scala> val treeset = TreeSet(1,2,3,4,5) // static type == dynamic type
treeset: scala.collection.immutable.TreeSet[Int] = TreeSet(1, 2, 3, 4, 5)

scala> val set: Set[Int] = TreeSet(1,2,3,4,5) // static type != dynamic type
set: Set[Int] = TreeSet(1, 2, 3, 4, 5)

scala> treeset.filter(_ % 2 == 0)
res0: scala.collection.immutable.TreeSet[Int] = TreeSet(2, 4) // fine, a TreeSet again

scala> set.filter(_ % 2 == 0)    
res1: scala.collection.immutable.Set[Int] = TreeSet(2, 4) // fine

scala> treeset.map(_ + 1)        
res2: scala.collection.immutable.SortedSet[Int] = TreeSet(2, 3, 4, 5, 6) // still fine

scala> set.map(_ + 1)    
res3: scala.collection.immutable.Set[Int] = Set(4, 5, 6, 2, 3) // uh?!

Now, I understand why this works like this. It is explained there and there. In short: the implicit CanBuildFrom is inserted based on the static type, and, depending on the implementation of its def apply(from: Coll) method, may or may not be able to recreate the same collection type.

Now my only point is, when we know that we are using a mapping operation yielding a collection with the same element type (which the compiler can statically determine), we could mimic the way the filtering transformations work and use the collection's native builder. We can reuse BitSet when mapping to Ints, create a new TreeSet with the same ordering, etc.

Then we would avoid cases where

for (i <- set) {
  val x = i + 1
  println(x)
}

does not print the incremented elements of the TreeSet in the same order as

for (i <- set; x = i + 1)
  println(x)
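
To make the native-builder idea concrete, here is a minimal sketch, not the library's implementation. The helper name mapSameType is hypothetical. It assumes an immutable Set and rebuilds the result from the receiver's own empty instance, which is selected by the runtime class rather than the static type, so a TreeSet stays a TreeSet and keeps its ordering.

```scala
import scala.collection.immutable.TreeSet

object SameKindMap {
  // Hypothetical helper, not part of the standard library.
  // `set.empty` is dispatched on the runtime class, so for a TreeSet it
  // returns an empty TreeSet that carries the same Ordering.
  def mapSameType[A](set: Set[A])(f: A => A): Set[A] =
    set.foldLeft(set.empty)((acc, a) => acc + f(a))

  // Static type Set[Int], dynamic type TreeSet[Int], as in the question.
  val set: Set[Int] = TreeSet(1, 2, 3, 4, 5)
  val mapped: Set[Int] = mapSameType(set)(_ + 1)

  // A TreeSet built with a non-default ordering keeps that ordering too.
  val reversed: Set[Int] = TreeSet(1, 2, 3)(Ordering[Int].reverse)
  val mappedReversed: Set[Int] = mapSameType(reversed)(_ + 1)
}
```

With this helper, iterating over mapped visits 2, 3, 4, 5, 6 in sorted order even though the static type is only Set[Int]. The cost is that it only covers functions of type A => A, which is exactly the special case under discussion.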

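So:

  • Do you think it would be a good idea to change the behavior of the mapping transformations as described?
  • What inevitable caveats have I thoroughly overlooked?
  • How could it be implemented?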

I was thinking about something like an implicit sameTypeEvidence: A =:= B parameter, maybe with a default value of null (or rather an implicit canReuseCalleeBuilderEvidence: B <:< A = null), which could be used at runtime to give more information to the CanBuildFrom, which in turn could be used to determine the type of builder to return.
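
A rough sketch of what such an evidence parameter might look like, written as a standalone function rather than a change to map itself. Everything here is an assumption for illustration: the name mapKeepingKind, the choice of B =:= A as the evidence type, and the fold over xs.empty as a stand-in for "the callee's builder". The cast is only sound because the evidence proves that A and B are the same type.

```scala
import scala.collection.immutable.TreeSet

object EvidenceSketch {
  // Hypothetical function, not part of the standard library.
  // If the compiler can prove B =:= A, `same` is non-null and the result is
  // rebuilt from the receiver's own empty instance; otherwise the default
  // null is used and we fall back to the ordinary map.
  def mapKeepingKind[A, B](xs: Set[A])(f: A => B)(implicit same: B =:= A = null): Set[B] =
    if (same != null) {
      val rebuilt: Set[A] = xs.foldLeft(xs.empty)((acc, a) => acc + same(f(a)))
      rebuilt.asInstanceOf[Set[B]] // safe only because A and B are equal
    } else xs.map(f)

  val set: Set[Int] = TreeSet(1, 2, 3, 4, 5)
  val sameType: Set[Int] = mapKeepingKind(set)(_ + 1)           // evidence found
  val otherType: Set[String] = mapKeepingKind(set)(_.toString) // falls back to map
}
```

In the first call the element type is unchanged, so the TreeSet is preserved; in the second no evidence exists and the usual Set result is returned. The weaker B <:< A evidence would not justify the cast on an invariant collection, which is one caveat of that variant.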

Recommended answer

I looked at it again, and I think your problem doesn't arise from a particular deficiency of Scala collections, but rather from a missing builder for TreeSet, because the following does work as intended:

val list = List(1,2,3,4,5)
val seq1: Seq[Int] = list
seq1.map( _ + 1 ) // yields List

val vector = Vector(1,2,3,4,5)
val seq2: Seq[Int] = vector
seq2.map( _ + 1 ) // yields Vector

So the reason is that TreeSet is missing a specialised companion object/builder:

seq1.companion.newBuilder[Int]    // ListBuffer
seq2.companion.newBuilder[Int]    // VectorBuilder
treeset.companion.newBuilder[Int] // Set (oops!)

So my guess is, if you provide such a companion for your RNA class, you may find that both map and filter work as you wish...?
