How do I apply the enrich-my-library pattern to Scala collections?

Question

One of the most powerful patterns available in Scala is the enrich-my-library* pattern, which uses implicit conversions to appear to add methods to existing classes without requiring dynamic method resolution. For example, if we wished that all strings had the method spaces that counted how many whitespace characters they had, we could:

class SpaceCounter(s: String) {
  def spaces = s.count(_.isWhitespace)
}
implicit def string_counts_spaces(s: String) = new SpaceCounter(s)

scala> "How many spaces do I have?".spaces
res1: Int = 5
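
As a side note, from Scala 2.10 onward the same enrichment can be written more compactly as an implicit class, which makes the compiler generate the conversion method. The sketch below assumes it is pasted into the REPL or run as a script; the wrapper object name StringEnrichment is made up for illustration.

```scala
// Hypothetical wrapper object: implicit classes must live inside an
// object, class, or trait rather than at the top level of a source file.
object StringEnrichment {
  // The compiler generates the String => SpaceCounter conversion for us.
  implicit class SpaceCounter(s: String) {
    def spaces: Int = s.count(_.isWhitespace)
  }
}

import StringEnrichment._

val n = "How many spaces do I have?".spaces
println(n) // 5
```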

Unfortunately, this pattern runs into trouble when dealing with generic collections. For example, a number of questions have been asked about grouping items sequentially with collections. There is nothing built in that works in one shot, so this seems an ideal candidate for the enrich-my-library pattern using a generic collection C and a generic element type A:

class SequentiallyGroupingCollection[A, C[A] <: Seq[A]](ca: C[A]) {
  def groupIdentical: C[C[A]] = {
    if (ca.isEmpty) C.empty[C[A]]
    else {
      val first = ca.head
      val (same,rest) = ca.span(_ == first)
      same +: (new SequentiallyGroupingCollection(rest)).groupIdentical
    }
  }
}

except, of course, it doesn't work. The REPL tells us:

<console>:12: error: not found: value C
               if (ca.isEmpty) C.empty[C[A]]
                               ^
<console>:16: error: type mismatch;
 found   : Seq[Seq[A]]
 required: C[C[A]]
                 same +: (new SequentiallyGroupingCollection(rest)).groupIdentical
                      ^

There are two problems: how do we get a C[C[A]] from an empty C[A] list (or from thin air)? And how do we get a C[C[A]] back from the same +: line instead of a Seq[Seq[A]]?

* Formerly known as pimp-my-library.

Recommended answer

The key to understanding this problem is to realize that there are two different ways to build and work with collections in the collections library. One is the public collections interface with all its nice methods. The other, which is used extensively in creating the collections library but is almost never used outside of it, is the builders.
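
To make the distinction concrete, here is a minimal sketch of both styles on concrete collection types. It only uses newBuilder, +=, ++= and result() from the standard library; the value names are made up for illustration.

```scala
// Public interface: methods such as map return a finished collection.
val doubled = List(1, 2, 3).map(_ * 2)

// Builder: a mutable accumulator that produces the collection on demand.
val b = Vector.newBuilder[Int]
b += 1              // add a single element
b += 2
b ++= Seq(3, 4)     // add several elements at once
val built = b.result()

println(doubled)    // List(2, 4, 6)
println(built)      // Vector(1, 2, 3, 4)
```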

Now the question is: where do we get our builders from? The obvious place is from the collection itself. This doesn't work. We already decided, in moving to a generic collection, that we were going to forget the type of the collection. So even though the collection could return a builder that would generate more collections of the type we want, it wouldn't know what the type was.

Instead, the builders come from the implicit CanBuildFrom instances that are floating around.

So, we have two conceptual leaps to make:


  1. We aren't using standard collections operations, we're using builders.
  2. We get these builders from implicit CanBuildFroms, not from our collection directly.

Let's look at an example.

class GroupingCollection[A, C[A] <: Iterable[A]](ca: C[A]) {
  import collection.generic.CanBuildFrom
  def groupedWhile(p: (A,A) => Boolean)(
    implicit cbfcc: CanBuildFrom[C[A],C[A],C[C[A]]], cbfc: CanBuildFrom[C[A],A,C[A]]
  ): C[C[A]] = {
    val it = ca.iterator
    val cca = cbfcc()
    if (!it.hasNext) cca.result
    else {
      val as = cbfc()
      var olda = it.next
      as += olda
      while (it.hasNext) {
        val a = it.next
        if (p(olda,a)) as += a
        else { cca += as.result; as.clear; as += a }
        olda = a
      }
      cca += as.result
    }
    cca.result
  }
}
implicit def iterable_has_grouping[A, C[A] <: Iterable[A]](ca: C[A]) = {
  new GroupingCollection[A,C](ca)
}

Let's take this apart. First, in order to build the collection-of-collections, we know we'll need to build two types of collections: C[A] for each group, and C[C[A]] that gathers all the groups together. Thus, we need two builders, one that takes As and builds C[A]s, and one that takes C[A]s and builds C[C[A]]s. Looking at the type signature of CanBuildFrom, we see

CanBuildFrom[-From, -Elem, +To]

which means that CanBuildFrom wants to know the type of collection we're starting with--in our case, it's C[A], and then the elements of the generated collection and the type of that collection. So we fill those in as implicit parameters cbfcc and cbfc.
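
As a small illustration of how the three type parameters line up, the sketch below asks the compiler for one such implicit and uses it. It assumes Scala 2.8 through 2.12, since CanBuildFrom was removed in Scala 2.13; the value names are made up.

```scala
import scala.collection.generic.CanBuildFrom

// From = List[Int], Elem = String, To = List[String]:
// "starting from a List[Int], given Strings, I can build a List[String]".
val cbf = implicitly[CanBuildFrom[List[Int], String, List[String]]]

val builder = cbf()   // apply() hands back an empty builder
builder += "a"
builder += "b"
val strings = builder.result()

println(strings)      // List(a, b)
```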

Having realized this, that's most of the work. We can use our CanBuildFroms to give us builders (all you need to do is apply them). And one builder can build up a collection with +=, convert it to the collection it is supposed to ultimately be with result, and empty itself and be ready to start again with clear. The builders start off empty, which solves our first compile error, and since we're using builders instead of recursion, the second error also goes away.
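
To see the +=, result and clear cycle in isolation, here is a non-generic sketch of the same loop for List[Int] only, with the builders taken directly from List.newBuilder rather than from CanBuildFrom. The name groupRuns is made up for illustration.

```scala
// Splits a list into runs of equal adjacent elements, using one outer
// builder for the groups and one inner builder that is reused per group.
def groupRuns(xs: List[Int]): List[List[Int]] = {
  val outer = List.newBuilder[List[Int]]
  val inner = List.newBuilder[Int]
  val it = xs.iterator
  if (it.hasNext) {
    var prev = it.next()
    inner += prev
    while (it.hasNext) {
      val x = it.next()
      if (x != prev) {
        outer += inner.result()  // finish the current group
        inner.clear()            // and start the next one from empty
      }
      inner += x
      prev = x
    }
    outer += inner.result()      // flush the last group
  }
  outer.result()                 // an untouched builder yields an empty list
}

println(groupRuns(List(1, 2, 2, 3)))  // List(List(1), List(2, 2), List(3))
```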

One last little detail--other than the algorithm that actually does the work--is in the implicit conversion. Note that we use new GroupingCollection[A,C] not [A,C[A]]. This is because the class declaration was for C with one parameter, which it fills in itself with the A passed to it. So we just hand it the type C, and let it create C[A] out of it. Minor detail, but you'll get compile-time errors if you try another way.
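
The same point can be seen on a much smaller class. This is only a sketch, and the class name Holder is made up: the second type parameter is a type constructor, so the caller supplies List, and the class applies it to A on its own.

```scala
// C takes one type parameter; the class forms C[A] internally.
class Holder[A, C[X] <: Iterable[X]](val ca: C[A]) {
  def firstOption: Option[A] = ca.headOption
}

// Pass the constructor List, not the applied type List[Int].
val h = new Holder[Int, List](List(1, 2, 3))
println(h.firstOption)  // Some(1)

// new Holder[Int, List[Int]](List(1, 2, 3)) is rejected by the compiler,
// because List[Int] is already fully applied and takes no parameter.
```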

Here, I've made the method a little bit more generic than the "equal elements" collection--rather, the method cuts the original collection apart whenever its test of sequential elements fails.

Let's see our method in action:

scala> List(1,2,2,2,3,4,4,4,5,5,1,1,1,2).groupedWhile(_ == _)
res0: List[List[Int]] = List(List(1), List(2, 2, 2), List(3), List(4, 4, 4), 
                             List(5, 5), List(1, 1, 1), List(2))

scala> Vector(1,2,3,4,1,2,3,1,2,1).groupedWhile(_ < _)
res1: scala.collection.immutable.Vector[scala.collection.immutable.Vector[Int]] =
  Vector(Vector(1, 2, 3, 4), Vector(1, 2, 3), Vector(1, 2), Vector(1))

It works!

The only problem is that we don't in general have these methods available for arrays, since that would require two implicit conversions in a row. There are several ways to get around this, including writing a separate implicit conversion for arrays, casting to WrappedArray, and so on.

Edit: My favored approach for dealing with arrays and strings and such is to make the code even more generic and then use appropriate implicit conversions to make them more specific again in such a way that arrays work also. In this particular case:

import collection.generic.CanBuildFrom

class GroupingCollection[A, C, D[C]](ca: C)(
  implicit c2i: C => Iterable[A],
           cbf: CanBuildFrom[C,C,D[C]],
           cbfi: CanBuildFrom[C,A,C]
) {
  def groupedWhile(p: (A,A) => Boolean): D[C] = {
    val it = c2i(ca).iterator
    val cca = cbf()
    if (!it.hasNext) cca.result
    else {
      val as = cbfi()
      var olda = it.next
      as += olda
      while (it.hasNext) {
        val a = it.next
        if (p(olda,a)) as += a
        else { cca += as.result; as.clear; as += a }
        olda = a
      }
      cca += as.result
    }
    cca.result
  }
}

Here we've added an implicit that gives us an Iterable[A] from C--for most collections this will just be the identity (e.g. List[A] already is an Iterable[A]), but for arrays it will be a real implicit conversion. And, consequently, we've dropped the requirement that C[A] <: Iterable[A]--we've basically just made the requirement for <% explicit, so we can use it explicitly at will instead of having the compiler fill it in for us. Also, we have relaxed the restriction that our collection-of-collections is C[C[A]]--instead, it's any D[C], which we will fill in later to be what we want. Because we're going to fill this in later, we've pushed it up to the class level instead of the method level. Otherwise, it's basically the same.
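
A stripped-down sketch of that explicit view parameter may help. It assumes Scala 2, where an implicit parameter of function type can be satisfied by an implicit conversion, and the helper name firstVia is made up. The type arguments are written out to keep the example independent of type inference.

```scala
// Only requires evidence that a C can be viewed as an Iterable[A].
def firstVia[A, C](c: C)(implicit c2i: C => Iterable[A]): A = c2i(c).head

// List[Int] already is an Iterable[Int], so the view is just the identity.
val fromList = firstVia[Int, List[Int]](List(1, 2, 3))

// Array[Int] is not an Iterable, so a real conversion (array wrapping) is used.
val fromArray = firstVia[Int, Array[Int]](Array(4, 5, 6))

println(fromList)   // 1
println(fromArray)  // 4
```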

Now the question is how to use this. For regular collections, we can:

implicit def collections_have_grouping[A, C[A]](ca: C[A])(
  implicit c2i: C[A] => Iterable[A],
           cbf: CanBuildFrom[C[A],C[A],C[C[A]]],
           cbfi: CanBuildFrom[C[A],A,C[A]]
) = {
  new GroupingCollection[A,C[A],C](ca)(c2i, cbf, cbfi)
}

where now we plug in C[A] for C and C[C[A]] for D[C]. Note that we do need the explicit generic types on the call to new GroupingCollection so it can keep straight which types correspond to what. Thanks to the implicit c2i: C[A] => Iterable[A], this automatically handles arrays.

But wait, what if we want to use strings? Now we're in trouble, because you can't have a "string of strings". This is where the extra abstraction helps: we can call D something that's suitable to hold strings. Let's pick Vector, and do the following:

val vector_string_builder = (
  new CanBuildFrom[String, String, Vector[String]] {
    def apply() = Vector.newBuilder[String]
    def apply(from: String) = this.apply()
  }
)

implicit def strings_have_grouping(s: String)(
  implicit c2i: String => Iterable[Char],
           cbfi: CanBuildFrom[String,Char,String]
) = {
  new GroupingCollection[Char,String,Vector](s)(
    c2i, vector_string_builder, cbfi
  )
}

We need a new CanBuildFrom to handle the building of a vector of strings (but this is really easy, since we just need to call Vector.newBuilder[String]), and then we need to fill in all the types so that the GroupingCollection is typed sensibly. Note that we already have floating around a [String,Char,String] CanBuildFrom, so strings can be made from collections of chars.
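
For reference, the existing [String,Char,String] instance can be summoned and used on its own. This sketch again assumes Scala 2.8 through 2.12, where CanBuildFrom is available; the value names are made up.

```scala
import scala.collection.generic.CanBuildFrom

// The standard library already provides this instance for strings.
val stringCbf = implicitly[CanBuildFrom[String, Char, String]]

val sb = stringCbf()  // a builder that takes Chars and yields a String
sb += 'h'
sb += 'i'
val word = sb.result()

println(word)         // hi
```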

Let's try it out:

scala> List(true,false,true,true,true).groupedWhile(_ == _)
res1: List[List[Boolean]] = List(List(true), List(false), List(true, true, true))

scala> Array(1,2,5,3,5,6,7,4,1).groupedWhile(_ <= _) 
res2: Array[Array[Int]] = Array(Array(1, 2, 5), Array(3, 5, 6, 7), Array(4), Array(1))

scala> "Hello there!!".groupedWhile(_.isLetter == _.isLetter)
res3: Vector[String] = Vector(Hello,  , there, !!)
