Should Scala's map() behave differently when mapping to the same type?
Question
In the Scala Collections framework, I think map() exhibits some counterintuitive behavior.
We can distinguish two kinds of transformations on (immutable) collections: those whose implementation calls newBuilder to recreate the resulting collection, and those that go through an implicit CanBuildFrom to obtain the builder.
The first category contains all transformations where the type of the contained elements does not change, for example filter, partition, drop, take, span, etc. These transformations are free to call newBuilder and to recreate the same collection type as the one they are called on, no matter how specific: filtering a List[Int] can always return a List[Int]; filtering a BitSet (or the RNA example structure described in this article on the architecture of the collections framework) can always return another BitSet (or RNA). Let's call them the filtering transformations.
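As a small illustration of the filtering transformations, the sketch below checks that the runtime collection type survives even when the static type is the more general Set[Int]. The object name FilteringDemo and the sample values are made up for this example.

```scala
import scala.collection.immutable.{BitSet, TreeSet}

object FilteringDemo {
  // Static type Set[Int], runtime type TreeSet[Int]
  val set: Set[Int] = TreeSet(5, 3, 1, 4, 2)

  // Each of these rebuilds the result with the receiver's own builder,
  // so the result is still a TreeSet at runtime
  val evens: Set[Int] = set.filter(_ % 2 == 0)
  val firstThree: Set[Int] = set.take(3)
  val (small, large) = set.partition(_ < 3)

  // The same holds for a collection as specific as BitSet
  val bits: BitSet = BitSet(1, 2, 3, 4).filter(_ > 2)
}
```

Because the results are still sorted sets, iterating over them keeps the TreeSet ordering.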
The second category of transformations needs CanBuildFroms to be more flexible, because the type of the contained elements may change, and as a result the type of the collection itself may not be reusable: a BitSet cannot contain Strings; an RNA contains only Bases. Examples of such transformations are map, flatMap, collect, scanLeft, ++, etc. Let's call them the mapping transformations.
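A minimal sketch of why the mapping transformations need this flexibility: mapping a BitSet to Ints can stay a BitSet, while mapping it to Strings has to fall back to a more general set type. MappingDemo is an illustrative name, and the exact static type of the second result depends on the Scala version, so it is left to inference here.

```scala
import scala.collection.immutable.BitSet

object MappingDemo {
  val bits: BitSet = BitSet(1, 2, 3)

  // Int => Int: the element type is unchanged, so a BitSet can be rebuilt
  val shifted: BitSet = bits.map(_ + 1)

  // Int => String: a BitSet cannot hold Strings, so the result is
  // some more general set of Strings
  val labels = bits.map(_.toString)
}
```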
Now here's the main issue to discuss. No matter what the static type of the collection is, all filtering transformations will return the same collection type, while the collection type returned by a mapping operation can vary depending on the static type.
scala> import collection.immutable.TreeSet
import collection.immutable.TreeSet
scala> val treeset = TreeSet(1,2,3,4,5) // static type == dynamic type
treeset: scala.collection.immutable.TreeSet[Int] = TreeSet(1, 2, 3, 4, 5)
scala> val set: Set[Int] = TreeSet(1,2,3,4,5) // static type != dynamic type
set: Set[Int] = TreeSet(1, 2, 3, 4, 5)
scala> treeset.filter(_ % 2 == 0)
res0: scala.collection.immutable.TreeSet[Int] = TreeSet(2, 4) // fine, a TreeSet again
scala> set.filter(_ % 2 == 0)
res1: scala.collection.immutable.Set[Int] = TreeSet(2, 4) // fine
scala> treeset.map(_ + 1)
res2: scala.collection.immutable.SortedSet[Int] = TreeSet(2, 3, 4, 5, 6) // still fine
scala> set.map(_ + 1)
res3: scala.collection.immutable.Set[Int] = Set(4, 5, 6, 2, 3) // uh?!
Now, I understand why it works like this. It is explained there and there. In short: the implicit CanBuildFrom is inserted based on the static type and, depending on the implementation of its def apply(from: Coll) method, may or may not be able to recreate the same collection type.
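The static selection can be imitated with a toy type class. This is not the real CanBuildFrom. BuilderChoice, its instances, and chosenFor are hypothetical names invented for this sketch, which only shows that the compiler picks an implicit from the static type of the argument and never from its runtime class.

```scala
import scala.collection.immutable.TreeSet

// Hypothetical stand-in for CanBuildFrom, reduced to a description string
trait BuilderChoice[From] { def describe: String }

object BuilderChoice {
  implicit def forSet[A]: BuilderChoice[Set[A]] =
    new BuilderChoice[Set[A]] { def describe = "generic Set builder" }
  implicit def forTreeSet[A]: BuilderChoice[TreeSet[A]] =
    new BuilderChoice[TreeSet[A]] { def describe = "TreeSet builder" }
}

object StaticDispatchDemo {
  // C is inferred from the static type of coll, then the implicit is looked up
  def chosenFor[C](coll: C)(implicit choice: BuilderChoice[C]): String =
    choice.describe

  val treeset = TreeSet(1, 2, 3)           // static type TreeSet[Int]
  val set: Set[Int] = TreeSet(1, 2, 3)     // static type Set[Int]

  val forTreeSet: String = chosenFor(treeset) // resolved for TreeSet[Int]
  val forSet: String = chosenFor(set)         // resolved for Set[Int]
}
```

In the real library the selected CanBuildFrom can still inspect the runtime collection in its apply(from) method, which is why the outcome depends on how that method is implemented.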
Now my only point is this: when we know that a mapping operation yields a collection with the same element type (which the compiler can determine statically), we could mimic the way the filtering transformations work and use the collection's native builder. We could reuse BitSet when mapping to Ints, create a new TreeSet with the same ordering, etc.
Then we would avoid cases where
for (i <- set) {
  val x = i + 1
  println(x)
}
does not print the incremented elements of the TreeSet in the same order as
for (i <- set; x = i + 1)
  println(x)
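The two loops differ because of how for-comprehensions are desugared. Roughly, the first one becomes a plain foreach, while a value definition inside the generator list first goes through map, which builds an intermediate collection of pairs. The sketch below spells out an approximate desugaring. The object and method names are illustrative, the results are collected into lists instead of printed, and the exact compiler output differs in its details.

```scala
import scala.collection.immutable.TreeSet

object ForDesugaringDemo {
  val set: Set[Int] = TreeSet(1, 2, 3, 4, 5)

  // Approximately: for (i <- set) { val x = i + 1; ... }
  def viaForeach(): List[Int] = {
    val out = List.newBuilder[Int]
    set.foreach { i => val x = i + 1; out += x }
    out.result()
  }

  // Approximately: for (i <- set; x = i + 1) ...
  // The intermediate map yields a set of (Int, Int) pairs, which is built
  // by the generic Set builder and so no longer follows the TreeSet ordering
  def viaMap(): List[Int] = {
    val out = List.newBuilder[Int]
    set.map { i => val x = i + 1; (i, x) }.foreach { case (_, x) => out += x }
    out.result()
  }
}
```

Both methods collect the same elements. Only the first is guaranteed to visit them in the TreeSet's order.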
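So:
- Do you think it would be a good idea to change the behavior of the mapping transformations as described?
- What inevitable caveats have I grossly overlooked?
- How could it be implemented?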
I was thinking about something like an implicit sameTypeEvidence: A =:= B parameter, maybe with a default value of null (or rather an implicit canReuseCalleeBuilderEvidence: B <:< A = null), which could be used at runtime to give more information to the CanBuildFrom, which in turn could use it to determine the type of builder to return.
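A minimal sketch of the evidence idea, written as a standalone helper rather than as a change to the library. The names EvidenceSketch and mapReusingBuilder are hypothetical, and only sorted sets are special-cased. When the compiler can prove B <:< A, the evidence is non-null and the helper rebuilds a TreeSet with the original ordering. Otherwise the default null is used and it falls back to the ordinary map.

```scala
import scala.collection.immutable.{SortedSet, TreeSet}

object EvidenceSketch {
  def mapReusingBuilder[A, B](set: Set[A])(f: A => B)(
      implicit ev: B <:< A = null): Set[B] =
    set match {
      // Runtime check for a sorted set, plus compile-time proof that B <: A
      case sorted: SortedSet[A] if ev != null =>
        // Compare Bs by viewing them as As, which reuses the original ordering
        implicit val ord: Ordering[B] = sorted.ordering.on[B](ev)
        val builder = TreeSet.newBuilder[B]
        sorted.foreach(a => builder += f(a))
        builder.result()
      case _ =>
        // No evidence or not a sorted set: ordinary map
        set.map(f)
    }
}
```

With set: Set[Int] = TreeSet(1, 2, 3, 4, 5), calling EvidenceSketch.mapReusingBuilder(set)(_ + 1) gives back a TreeSet, while mapping with _.toString takes the fallback branch. The null default keeps existing call sites compiling, at the cost of a null check inside the method.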
Recommended answer
I looked at it again, and I think your problem arises not from a particular deficiency of the Scala collections, but rather from a missing builder for TreeSet, because the following does work as intended:
val list = List(1,2,3,4,5)
val seq1: Seq[Int] = list
seq1.map( _ + 1 ) // yields List
val vector = Vector(1,2,3,4,5)
val seq2: Seq[Int] = vector
seq2.map( _ + 1 ) // yields Vector
So the reason is that TreeSet is missing a specialised companion object/builder:
seq1.companion.newBuilder[Int] // ListBuffer
seq2.companion.newBuilder[Int] // VectorBuilder
treeset.companion.newBuilder[Int] // Set (oops!)
So my guess is that if you make proper provision for such a companion for your RNA class, you may find that both map and filter work as you wish...?