How can I ensure that the dynamic type of my custom Scala collection is preserved during a map()?

Problem description

I read the very interesting article on the architecture of the Scala 2.8 collections and I've been experimenting with it a little bit. For a start, I simply copied the final code for the nice RNA example. Here it is for reference:

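// Targets the pre-2.13 collections (Scala 2.8 to 2.12); CanBuildFrom was removed in 2.13.
import scala.collection.IndexedSeqLike
import scala.collection.generic.CanBuildFrom
import scala.collection.mutable.{ArrayBuffer, Builder}

abstract class Base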
case object A extends Base
case object T extends Base
case object G extends Base
case object U extends Base

object Base {
  val fromInt: Int => Base = Array(A, T, G, U)
  val toInt: Base => Int = Map(A -> 0, T -> 1, G -> 2, U -> 3)
}

final class RNA private (val groups: Array[Int], val length: Int)
    extends IndexedSeq[Base] with IndexedSeqLike[Base, RNA] {

  import RNA._

  // Mandatory re-implementation of `newBuilder` in `IndexedSeq`
  override protected[this] def newBuilder: Builder[Base, RNA] =
    RNA.newBuilder

  // Mandatory implementation of `apply` in `IndexedSeq`
  def apply(idx: Int): Base = {
    if (idx < 0 || length <= idx)
      throw new IndexOutOfBoundsException
    Base.fromInt(groups(idx / N) >> (idx % N * S) & M)
  }

  // Optional re-implementation of foreach, 
  // to make it more efficient.
  override def foreach[U](f: Base => U): Unit = {
    var i = 0
    var b = 0
    while (i < length) {
      b = if (i % N == 0) groups(i / N) else b >>> S
      f(Base.fromInt(b & M))
      i += 1
    }
  }
}

object RNA {

  private val S = 2 // number of bits in group
  private val M = (1 << S) - 1 // bitmask to isolate a group
  private val N = 32 / S // number of groups in an Int

  def fromSeq(buf: Seq[Base]): RNA = {
    val groups = new Array[Int]((buf.length + N - 1) / N)
    for (i <- 0 until buf.length)
      groups(i / N) |= Base.toInt(buf(i)) << (i % N * S)
    new RNA(groups, buf.length)
  }

  def apply(bases: Base*) = fromSeq(bases)

  def newBuilder: Builder[Base, RNA] =
    new ArrayBuffer mapResult fromSeq

  implicit def canBuildFrom: CanBuildFrom[RNA, Base, RNA] =
    new CanBuildFrom[RNA, Base, RNA] {
      def apply(): Builder[Base, RNA] = newBuilder
      def apply(from: RNA): Builder[Base, RNA] = newBuilder
    }
}

Now, here's my problem. If I run this, everything's fine:

val rna = RNA(A, G, T, U)
println(rna.map(e => e)) // prints RNA(A, G, T, U)

But this code turns the RNA into a Vector:

val rna: IndexedSeq[Base] = RNA(A, G, T, U)
println(rna.map(e => e)) // prints Vector(A, G, T, U)

This is a problem, because client code that is unaware of the RNA class may turn an RNA into a Vector even when it only maps from Base to Base. Why is that so, and what are the ways to fix it?

P.S.: I've found a tentative answer (see below); please correct me if I'm wrong.

Recommended answer

If the static type of the rna variable is IndexedSeq[Base], the automatically inserted CanBuildFrom cannot be the one defined in the RNA companion object, as the compiler is not supposed to know that rna is an instance of RNA.

So where does it come from? The compiler falls back on an instance of GenericCanBuildFrom, the one defined in the IndexedSeq object. A GenericCanBuildFrom produces its builder by calling genericBuilder[B] on the originating collection, and that generic builder must be able to produce a collection that can hold any element type B, because the return type of the function passed to map() is not constrained.
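
The compile-time selection described above can be modeled without the collections library. The following is only a minimal sketch: Shape, Circle, Rebuild, and chosen are hypothetical names that stand in for IndexedSeq[Base], RNA, CanBuildFrom, and map() respectively, and Rebuild is kept invariant for simplicity. It shows that the same object picks up a different implicit depending on the static type it is seen through.

```scala
object StaticDispatchDemo {
  // Hypothetical stand-ins: Shape plays IndexedSeq[Base], Circle plays RNA.
  trait Shape
  final class Circle extends Shape

  // Hypothetical stand-in for CanBuildFrom, invariant in From for simplicity.
  trait Rebuild[From] { def name: String }

  object Shape {
    // Fallback instance found through the companion of the wider type.
    implicit val generic: Rebuild[Shape] =
      new Rebuild[Shape] { def name = "generic" }
  }

  object Circle {
    // Specific instance, visible only when the static type is Circle.
    implicit val specific: Rebuild[Circle] =
      new Rebuild[Circle] { def name = "specific" }
  }

  // The implicit is resolved at compile time from T, the static type of `value`.
  def chosen[T](value: T)(implicit r: Rebuild[T]): String = r.name

  val asCircle: Circle = new Circle
  val asShape: Shape = asCircle // same object, wider static type

  def main(args: Array[String]): Unit = {
    println(chosen(asCircle)) // prints specific
    println(chosen(asShape))  // prints generic
  }
}
```

Widening the static type is enough to lose the specific instance, which mirrors what happens when an RNA is held in a variable of type IndexedSeq[Base].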

In this case, RNA is only an IndexedSeq[Base] and not a generic IndexedSeq, so it's not possible to override genericBuilder[B] in RNA to return an RNA-specific builder: we would have to check at runtime whether B is Base or something else, and we cannot do that.
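
To illustrate why that runtime check is out of reach, here is a small sketch under stated assumptions. pickErased has the same shape as genericBuilder[B], in that B is only a type parameter and is erased at runtime. pickTagged is a hypothetical variant that takes a scala.reflect.ClassTag (available from Scala 2.10); it is not the library's signature and could not be used to override genericBuilder, it only shows what extra information the check would need. Both method names and the local Base trait are invented for this example.

```scala
import scala.reflect.ClassTag

object ErasureDemo {
  // Local stand-in for the Base type from the question.
  sealed trait Base

  // Same shape as genericBuilder[B]: B exists only at compile time,
  // so the body has nothing to branch on and must give the generic answer.
  def pickErased[B]: String = "generic"

  // Hypothetical variant (not the library's signature): the ClassTag
  // carries B's runtime class, which makes the check possible.
  def pickTagged[B](implicit tag: ClassTag[B]): String =
    if (classOf[Base].isAssignableFrom(tag.runtimeClass)) "RNA" else "generic"

  def main(args: Array[String]): Unit = {
    println(pickErased[Base])   // prints generic
    println(pickTagged[Base])   // prints RNA
    println(pickTagged[String]) // prints generic
  }
}
```

Because the inherited method has no such extra parameter, an override in RNA is in the position of pickErased and can only return the generic builder.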

I think this explains why, in the question, we get a Vector back. As to how we can fix it, it's an open question…

Edit: Fixing this requires map() to know whether it's mapping to a subtype of the collection's element type A or not. A significant change in the collections library would be needed for this to happen. See the related question "Should Scala's map() behave differently when mapping to the same type?".
