Scala 默认设置实现 [英] Scala default Set Implementation
问题描述
我可以从 Scala 文档中看到 scala.collection.immutable.Set 只是一个特征.默认情况下使用 Set 实现中的哪一个?HashSet 或 TreeSet(或其他)?
I can see that from Scala documentation scala.collection.immutable.Set is only a trait. Which one on the Set implementation is used by default ? HashSet or TreeSet (or something else) ?
我想了解/规划某些功能的运行时间.
I would like to know/plan the running time of certain functions.
示例:
scala> val s = Set(1,3,6,2,7,1)
res0: scala.collection.immutable.Set[Int] = Set(1, 6, 2, 7, 3)
s.find(5)、O(1) 或 O(log(n)) 的运行时间是多少?
What would be the running time of s.find(5), O(1) or O(log(n)) ?
既然同样适用于 Map,那么解决这个问题的最佳方法是什么?
Since same apply for Map, what is the best way to figure this out ?
推荐答案
通过查看源码,可以发现set最多四个元素都有
EmptySet
提供的优化实现,Set1
、Set2
、Set3
和Set4
,它们只保存单个值.By looking at the source code, you can find that sets up to four elements have an optimized implementation provided by
EmptySet
,Set1
,Set2
,Set3
andSet4
, which simply hold the single values.例如这里是
Set2
声明(从 Scala 2.11.4 开始):For example here's
Set2
declaration (as of scala 2.11.4):class Set2[A] private[collection] (elem1: A, elem2: A) extends AbstractSet[A] with Set[A] with Serializable
这里是
contains
实现:def contains(elem: A): Boolean = elem == elem1 || elem == elem2
或
find
实现override def find(f: A => Boolean): Option[A] = { if (f(elem1)) Some(elem1) else if (f(elem2)) Some(elem2) else None }
非常简单.
对于超过 4 个元素的集合,底层实现是一个
HashSet
.我们可以在 REPL 中轻松验证这一点:For sets with more than 4 elements, the underlying implementation is an
HashSet
. We can easily verify this in the REPL:scala> Set(1, 2, 3, 4).getClass res1: Class[_ <: scala.collection.immutable.Set[Int]] = class scala.collection.immutable.Set$Set4 scala> Set(1, 2, 3, 4, 5, 6).getClass res0: Class[_ <: scala.collection.immutable.Set[Int]] = class scala.collection.immutable.HashSet$HashTrieSet
话虽如此,
find
必须始终遍历整个HashSet
,因为它是未排序的,所以它是O(n)
.相反,像contains
这样的查找操作将是O(1)
.That being said,
find
must always iterate over the wholeHashSet
, since it's unsorted, so it will beO(n)
. Conversely, a lookup operation likecontains
will beO(1)
instead.Here's a more in-depth reference about performance of scala collections in general.
说到
Map
,几乎相同的概念适用.有优化的Map
实现,最多 4 个元素,然后是HashMap
.Speaking of
Map
, pretty much the same concepts apply. There are optimizedMap
implementations up to 4 elements, and then it's anHashMap
.这篇关于Scala 默认设置实现的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!