How can I ensure that the dynamic type of my custom Scala collection is preserved during a map()?
Question
I read the very interesting article on the architecture of the Scala 2.8 collections and I've been experimenting with it a little. For a start, I simply copied the final code for the nice RNA example. Here it is for reference:
// Targets the Scala 2.8 to 2.12 collections API (CanBuildFrom was removed in 2.13)
import collection.IndexedSeqLike
import collection.mutable.{Builder, ArrayBuffer}
import collection.generic.CanBuildFrom

abstract class Base
case object A extends Base
case object T extends Base
case object G extends Base
case object U extends Base

object Base {
  val fromInt: Int => Base = Array(A, T, G, U)
  val toInt: Base => Int = Map(A -> 0, T -> 1, G -> 2, U -> 3)
}

final class RNA private (val groups: Array[Int], val length: Int)
  extends IndexedSeq[Base] with IndexedSeqLike[Base, RNA] {

  import RNA._

  // Mandatory re-implementation of `newBuilder` in `IndexedSeq`
  override protected[this] def newBuilder: Builder[Base, RNA] =
    RNA.newBuilder

  // Mandatory implementation of `apply` in `IndexedSeq`
  def apply(idx: Int): Base = {
    if (idx < 0 || length <= idx)
      throw new IndexOutOfBoundsException
    Base.fromInt(groups(idx / N) >> (idx % N * S) & M)
  }

  // Optional re-implementation of foreach,
  // to make it more efficient.
  override def foreach[U](f: Base => U): Unit = {
    var i = 0
    var b = 0
    while (i < length) {
      b = if (i % N == 0) groups(i / N) else b >>> S
      f(Base.fromInt(b & M))
      i += 1
    }
  }
}

object RNA {
  private val S = 2 // number of bits in group
  private val M = (1 << S) - 1 // bitmask to isolate a group
  private val N = 32 / S // number of groups in an Int

  def fromSeq(buf: Seq[Base]): RNA = {
    val groups = new Array[Int]((buf.length + N - 1) / N)
    for (i <- 0 until buf.length)
      groups(i / N) |= Base.toInt(buf(i)) << (i % N * S)
    new RNA(groups, buf.length)
  }

  def apply(bases: Base*) = fromSeq(bases)

  def newBuilder: Builder[Base, RNA] =
    new ArrayBuffer mapResult fromSeq

  implicit def canBuildFrom: CanBuildFrom[RNA, Base, RNA] =
    new CanBuildFrom[RNA, Base, RNA] {
      def apply(): Builder[Base, RNA] = newBuilder
      def apply(from: RNA): Builder[Base, RNA] = newBuilder
    }
}
Now, here's my problem. If I run this, everything's fine:
val rna = RNA(A, G, T, U)
println(rna.map(e => e)) // prints RNA(A, G, T, U)
But this code transforms the RNA into a Vector!
val rna: IndexedSeq[Base] = RNA(A, G, T, U)
println(rna.map(e => e)) // prints Vector(A, G, T, U)
This is a problem, because client code that is unaware of the RNA class may turn it back into a Vector even when it only maps from Base to Base. Why is that so, and what are the ways to fix it?
P.S.: I've found a tentative answer (see below); please correct me if I'm wrong.
Recommended answer
If the static type of the rna variable is IndexedSeq[Base], the automatically inserted CanBuildFrom cannot be the one defined in the RNA companion object, because the compiler is not supposed to know that rna is an instance of RNA.
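To make the static-type point concrete, here is a toy sketch that leaves the real collections library out entirely. Builds, General, Special and targetOf are hypothetical names standing in for CanBuildFrom, IndexedSeq[Base], RNA and map(). Unlike the real CanBuildFrom, Builds is invariant, which keeps the lookup simple, so treat this as an illustration of the principle rather than of the library's exact resolution rules.

```scala
object StaticDispatchSketch {
  // Hypothetical stand-in for CanBuildFrom: names the collection a map() would build.
  trait Builds[From] { def target: String }

  // Stand-in for the general collection type (IndexedSeq[Base] in the question).
  class General
  object General {
    implicit val generalBuilds: Builds[General] =
      new Builds[General] { def target = "Vector" }
  }

  // Stand-in for the specialised collection (RNA in the question).
  final class Special extends General
  object Special {
    implicit val specialBuilds: Builds[Special] =
      new Builds[Special] { def target = "RNA" }
  }

  // Stand-in for map(): the implicit is resolved at compile time from the static type C.
  def targetOf[C](c: C)(implicit b: Builds[C]): String = b.target

  def main(args: Array[String]): Unit = {
    val special = new Special
    val widened: General = special // same object, wider static type

    println(targetOf(special)) // prints RNA
    println(targetOf(widened)) // prints Vector
  }
}
```

Both calls receive the same object. Only the static type differs, and that alone decides which implicit the compiler inserts.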
So where does it come from? The compiler falls back on an instance of GenericCanBuildFrom, the one defined in the IndexedSeq object. A GenericCanBuildFrom produces its builder by calling genericBuilder[B] on the originating collection, and that generic builder must be able to produce a generic collection that can hold any type B, since the return type of the function passed to map() is not constrained.
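The delegation to genericBuilder[B] can be imitated in miniature. The sketch below is not the library's code: Coll, Vec and Rna are hypothetical types, and a plain function from List[B] stands in for a real Builder. It only shows why a builder that must work for every B ends up producing the generic collection.

```scala
object GenericBuilderSketch {
  // Hypothetical miniature collection; the real library uses Builder and CanBuildFrom.
  trait Coll[+A] {
    def elems: List[A]

    // Must be able to build a collection of ANY element type B.
    def genericBuilder[B]: List[B] => Coll[B]

    // Like map() called through the general static type: it only has the generic builder.
    def map[B](f: A => B): Coll[B] = genericBuilder[B](elems.map(f))
  }

  // Generic collection, standing in for Vector.
  final case class Vec[+A](elems: List[A]) extends Coll[A] {
    def genericBuilder[B]: List[B] => Coll[B] = (xs: List[B]) => Vec(xs)
  }

  // Specialised collection that can only hold Char, standing in for RNA.
  final case class Rna(elems: List[Char]) extends Coll[Char] {
    // Returning Rna is impossible here: B is arbitrary, so fall back to Vec.
    def genericBuilder[B]: List[B] => Coll[B] = (xs: List[B]) => Vec(xs)
  }

  def main(args: Array[String]): Unit = {
    val rna: Coll[Char] = Rna(List('A', 'U'))
    println(rna.map(c => c)) // prints Vec(List(A, U))
  }
}
```

Even an identity mapping loses the specialised type, which mirrors the Vector seen in the question.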
In this case, RNA is only an IndexedSeq[Base] and not a generic IndexedSeq, so it's not possible to override genericBuilder[B] in RNA to return an RNA-specific builder: we would have to check at runtime whether B is Base or something else, but we cannot do that, because the type parameter B is erased at runtime.
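For comparison, this is roughly what such a runtime check would need. The sketch assumes a hypothetical method, buildsRna, that takes an implicit ClassTag[B]. The genericBuilder[B] method declares no such parameter, so an override in RNA has no way to obtain this information.

```scala
import scala.reflect.ClassTag

object ErasureSketch {
  abstract class Base

  // Hypothetical: a check that is only possible because a ClassTag is passed in.
  // genericBuilder[B] takes no such parameter, so it cannot perform this test.
  def buildsRna[B](implicit tag: ClassTag[B]): Boolean =
    tag.runtimeClass == classOf[Base]

  def main(args: Array[String]): Unit = {
    println(buildsRna[Base])   // prints true
    println(buildsRna[String]) // prints false
  }
}
```

Adding a parameter like this to the real method would change its signature for every collection, which is the kind of library-wide change the answer goes on to mention.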
I think this explains why, in the question, we get a Vector back. As to how we can fix it, it's an open question…
Edit: Fixing this requires map() to know whether or not it's mapping to a subtype of A. That would need a significant change in the collections library. See the related question "Should Scala's map() behave differently when mapping to the same type?".