How do I apply the enrich-my-library pattern to Scala collections?

Question

One of the most powerful patterns available in Scala is the enrich-my-library* pattern, which uses implicit conversions to appear to add methods to existing classes without requiring dynamic method resolution. For example, if we wished that all strings had the method spaces that counted how many whitespace characters they had, we could:

class SpaceCounter(s: String) {
  def spaces = s.count(_.isWhitespace)
}
implicit def string_counts_spaces(s: String) = new SpaceCounter(s)

scala> "How many spaces do I have?".spaces
res1: Int = 5
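
The same enrichment can also be written as an implicit class, which bundles the wrapper and the conversion into one declaration. This is a minimal sketch assuming Scala 2.10 or later; the enclosing object name StringEnrichment is made up for this illustration and is not part of the original question.

```scala
object StringEnrichment {
  // The implicit class supplies both the wrapper and the String => SpaceCounter conversion.
  // Extending AnyVal makes it a value class, which usually avoids allocating the wrapper.
  implicit class SpaceCounter(val s: String) extends AnyVal {
    def spaces: Int = s.count(_.isWhitespace)
  }
}

// Bring the conversion into scope wherever the extra method is wanted.
import StringEnrichment._
```

With the import in scope, "How many spaces do I have?".spaces evaluates to 5, as in the session above.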

Unfortunately, this pattern runs into trouble when dealing with generic collections. For example, a number of questions have been asked about grouping items sequentially with collections. There is nothing built in that works in one shot, so this seems an ideal candidate for the enrich-my-library pattern using a generic collection C and a generic element type A:

class SequentiallyGroupingCollection[A, C[A] <: Seq[A]](ca: C[A]) {
  def groupIdentical: C[C[A]] = {
    if (ca.isEmpty) C.empty[C[A]]
    else {
      val first = ca.head
      val (same,rest) = ca.span(_ == first)
      same +: (new SequentiallyGroupingCollection(rest)).groupIdentical
    }
  }
}

except, of course, it doesn't work. The REPL tells us:

<console>:12: error: not found: value C
               if (ca.isEmpty) C.empty[C[A]]
                               ^
<console>:16: error: type mismatch;
 found   : Seq[Seq[A]]
 required: C[C[A]]
                 same +: (new SequentiallyGroupingCollection(rest)).groupIdentical
                      ^

There are two problems: how do we get a C[C[A]] from an empty C[A] list (or from thin air)? And how do we get a C[C[A]] back from the same +: line instead of a Seq[Seq[A]]?

* Formerly known as pimp-my-library.

Recommended answer

The key to understanding this problem is to realize that there are two different ways to build and work with collections in the collections library. One is the public collections interface with all its nice methods. The other, which is used extensively in creating the collections library but almost never outside of it, is the builders.

Our problem in enriching is exactly the same one that the collections library itself faces when trying to return collections of the same type. That is, we want to build collections, but when working generically, we don't have a way to refer to "the same type that the collection already is". So we need builders.

Now the question is: where do we get our builders from? The obvious place is from the collection itself. This doesn't work. We already decided, in moving to a generic collection, that we were going to forget the type of the collection. So even though the collection could return a builder that would generate more collections of the type we want, it wouldn't know what the type was.

Instead, we get our builders from CanBuildFrom implicits that are floating around. These exist specifically for the purpose of matching input and output types and giving you an appropriately typed builder.

So, we have two conceptual leaps to make:

  1. We aren't using standard collections operations, we're using builders.
  2. We get these builders from implicit CanBuildFroms, not from our collection directly.

Let's look at an example.

class GroupingCollection[A, C[A] <: Iterable[A]](ca: C[A]) {
  import collection.generic.CanBuildFrom
  def groupedWhile(p: (A,A) => Boolean)(
    implicit cbfcc: CanBuildFrom[C[A],C[A],C[C[A]]], cbfc: CanBuildFrom[C[A],A,C[A]]
  ): C[C[A]] = {
    val it = ca.iterator
    val cca = cbfcc()
    if (!it.hasNext) cca.result
    else {
      val as = cbfc()
      var olda = it.next
      as += olda
      while (it.hasNext) {
        val a = it.next
        if (p(olda,a)) as += a
        else { cca += as.result; as.clear; as += a }
        olda = a
      }
      cca += as.result
    }
    cca.result
  }
}
implicit def iterable_has_grouping[A, C[A] <: Iterable[A]](ca: C[A]) = {
  new GroupingCollection[A,C](ca)
}

Let's take this apart. First, in order to build the collection-of-collections, we know we'll need to build two types of collections: C[A] for each group, and C[C[A]] that gathers all the groups together. Thus, we need two builders, one that takes As and builds C[A]s, and one that takes C[A]s and builds C[C[A]]s. Looking at the type signature of CanBuildFrom, we see

CanBuildFrom[-From, -Elem, +To]

which means that CanBuildFrom wants to know the type of collection we're starting with--in our case, it's C[A], and then the elements of the generated collection and the type of that collection. So we fill those in as implicit parameters cbfcc and cbfc.
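
To make the two type signatures concrete, the sketch below summons both implicits by hand for C = List and A = Int and uses them to build a small nested list. It assumes Scala 2.12 or earlier, since CanBuildFrom was removed in Scala 2.13; the object name CanBuildFromSketch is made up for this illustration.

```scala
import scala.collection.generic.CanBuildFrom

object CanBuildFromSketch {
  // With C = List and A = Int, these are the two implicits groupedWhile asks for.
  val cbfc  = implicitly[CanBuildFrom[List[Int], Int, List[Int]]]
  val cbfcc = implicitly[CanBuildFrom[List[Int], List[Int], List[List[Int]]]]

  // Applying a CanBuildFrom yields a fresh, empty builder.
  val inner = cbfc()
  inner += 1
  inner += 2

  // The outer builder takes finished List[Int] groups as its elements.
  val outer = cbfcc()
  outer += inner.result()

  val nested: List[List[Int]] = outer.result()
}
```

Both implicits are found in the List companion object, the same place the compiler looks when it resolves them for groupedWhile.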

Having realized this, that's most of the work. We can use our CanBuildFroms to give us builders (all you need to do is apply them). And one builder can build up a collection with +=, convert it to the collection it is supposed to ultimately be with result, and empty itself and be ready to start again with clear. The builders start off empty, which solves our first compile error, and since we're using builders instead of recursion, the second error also goes away.
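
The builder lifecycle described here (+=, result, clear) can be tried in isolation. In this minimal sketch Vector.newBuilder stands in for the builder that a CanBuildFrom would hand back, and the object name BuilderLifecycle is made up for this illustration.

```scala
object BuilderLifecycle {
  // Stand-in for the builder a CanBuildFrom would return; it starts off empty.
  val b = Vector.newBuilder[Int]

  // Accumulate the first group and freeze it into a collection.
  b += 1
  b += 2
  val first: Vector[Int] = b.result()

  // clear() empties the builder so the same instance can collect the next group.
  b.clear()
  b += 3
  val second: Vector[Int] = b.result()
}
```

Here first is Vector(1, 2) and second is Vector(3): one builder produced two independent collections, which is how groupedWhile reuses its inner builder for every group.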

One last little detail--other than the algorithm that actually does the work--is in the implicit conversion. Note that we use new GroupingCollection[A,C] not [A,C[A]]. This is because the class declaration was for C with one parameter, which it fills in itself with the A passed to it. So we just hand it the type C, and let it create C[A] out of it. Minor detail, but you'll get compile-time errors if you try another way.
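
The difference between passing C and passing C[A] is easier to see on a stripped-down class. The sketch below uses a hypothetical Wrapper class (not part of the original answer) with the same shape of type parameters as GroupingCollection.

```scala
object KindSketch {
  // C is declared as a type constructor taking one parameter; the class applies it to A itself.
  class Wrapper[A, C[X] <: Iterable[X]](val ca: C[A])

  // Pass the bare constructor List, not the applied type List[Int].
  val ok = new Wrapper[Int, List](List(1, 2, 3))

  // new Wrapper[Int, List[Int]](List(1, 2, 3))
  // Uncommenting the line above fails to compile: List[Int] is already a
  // complete type, while C must still accept one type parameter.
}
```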

Here, I've made the method a little bit more generic than the "equal elements" collection--rather, the method cuts the original collection apart whenever its test of sequential elements fails.
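
To see how the general test subsumes the question's groupIdentical, here is a simplified sketch of the same loop fixed to List, so that it needs no CanBuildFrom and should compile on any recent Scala version. It is an illustration rather than the answer's code: the object name GroupedWhileSketch is made up, and List.newBuilder replaces the implicit builders.

```scala
object GroupedWhileSketch {
  // Same loop as the answer's groupedWhile, but specialized to List.
  def groupedWhile[A](xs: List[A])(p: (A, A) => Boolean): List[List[A]] = {
    val groups  = List.newBuilder[List[A]]
    val current = List.newBuilder[A]
    val it = xs.iterator
    if (it.hasNext) {
      var prev = it.next()
      current += prev
      while (it.hasNext) {
        val a = it.next()
        // When the test on neighbouring elements fails, close the current group.
        if (!p(prev, a)) { groups += current.result(); current.clear() }
        current += a
        prev = a
      }
      groups += current.result()
    }
    groups.result()
  }

  // The question's groupIdentical is the special case where the test is equality.
  def groupIdentical[A](xs: List[A]): List[List[A]] = groupedWhile(xs)(_ == _)
}
```

For example, GroupedWhileSketch.groupIdentical(List(1, 1, 2)) yields List(List(1, 1), List(2)).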

Let's see our method in action:

scala> List(1,2,2,2,3,4,4,4,5,5,1,1,1,2).groupedWhile(_ == _)
res0: List[List[Int]] = List(List(1), List(2, 2, 2), List(3), List(4, 4, 4), 
                             List(5, 5), List(1, 1, 1), List(2))

scala> Vector(1,2,3,4,1,2,3,1,2,1).groupedWhile(_ < _)
res1: scala.collection.immutable.Vector[scala.collection.immutable.Vector[Int]] =
  Vector(Vector(1, 2, 3, 4), Vector(1, 2, 3), Vector(1, 2), Vector(1))

It works!

The only problem is that we don't in general have these methods available for arrays, since that would require two implicit conversions in a row. There are several ways to get around this, including writing a separate implicit conversion for arrays, casting to WrappedArray, and so on.

My favored approach for dealing with arrays and strings and such is to make the code even more generic and then use appropriate implicit conversions to make them more specific again in such a way that arrays work also. In this particular case:

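// The earlier import was local to the first class body, so import CanBuildFrom again here.
import collection.generic.CanBuildFrom
class GroupingCollection[A, C, D[C]](ca: C)(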
  implicit c2i: C => Iterable[A],
           cbf: CanBuildFrom[C,C,D[C]],
           cbfi: CanBuildFrom[C,A,C]
) {
  def groupedWhile(p: (A,A) => Boolean): D[C] = {
    val it = c2i(ca).iterator
    val cca = cbf()
    if (!it.hasNext) cca.result
    else {
      val as = cbfi()
      var olda = it.next
      as += olda
      while (it.hasNext) {
        val a = it.next
        if (p(olda,a)) as += a
        else { cca += as.result; as.clear; as += a }
        olda = a
      }
      cca += as.result
    }
    cca.result
  }
}

Here we've added an implicit that gives us an Iterable[A] from C--for most collections this will just be the identity (e.g. List[A] already is an Iterable[A]), but for arrays it will be a real implicit conversion. And, consequently, we've dropped the requirement that C[A] <: Iterable[A]--we've basically just made the requirement for <% explicit, so we can use it explicitly at will instead of having the compiler fill it in for us. Also, we have relaxed the restriction that our collection-of-collections is C[C[A]]--instead, it's any D[C], which we will fill in later to be what we want. Because we're going to fill this in later, we've pushed it up to the class level instead of the method level. Otherwise, it's basically the same.
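
The explicit conversion parameter can be tried on its own, apart from the grouping logic. The sketch below assumes Scala 2, where the standard library supplies an implicit wrapper from arrays to sequences; firstElement and ViewSketch are hypothetical names introduced only for this illustration.

```scala
object ViewSketch {
  // c2i plays the role of a view bound (C[A] <% Iterable[A]) spelled out as an implicit parameter.
  def firstElement[A, C[_]](ca: C[A])(implicit c2i: C[A] => Iterable[A]): Option[A] =
    c2i(ca).headOption

  // For a List the conversion is just the identity, since List[Int] already is an Iterable[Int].
  val fromList = firstElement(List(1, 2, 3))

  // For an Array the compiler supplies a real conversion that wraps the array.
  val fromArray = firstElement(Array(4, 5, 6))
}
```

The same method body serves both calls; only the implicit that the compiler fills in for c2i differs.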

Now the question is how to use this. For regular collections, we can:

implicit def collections_have_grouping[A, C[A]](ca: C[A])(
  implicit c2i: C[A] => Iterable[A],
           cbf: CanBuildFrom[C[A],C[A],C[C[A]]],
           cbfi: CanBuildFrom[C[A],A,C[A]]
) = {
  new GroupingCollection[A,C[A],C](ca)(c2i, cbf, cbfi)
}

where now we plug in C[A] for C and C[C[A]] for D[C]. Note that we do need the explicit generic types on the call to new GroupingCollection so it can keep straight which types correspond to what. Thanks to the implicit c2i: C[A] => Iterable[A], this automatically handles arrays.

But wait, what if we want to use strings? Now we're in trouble, because you can't have a "string of strings". This is where the extra abstraction helps: we can call D something that's suitable to hold strings. Let's pick Vector, and do the following:

val vector_string_builder = (
  new CanBuildFrom[String, String, Vector[String]] {
    def apply() = Vector.newBuilder[String]
    def apply(from: String) = this.apply()
  }
)

implicit def strings_have_grouping(s: String)(
  implicit c2i: String => Iterable[Char],
           cbfi: CanBuildFrom[String,Char,String]
) = {
  new GroupingCollection[Char,String,Vector](s)(
    c2i, vector_string_builder, cbfi
  )
}

We need a new CanBuildFrom to handle the building of a vector of strings (but this is really easy, since we just need to call Vector.newBuilder[String]), and then we need to fill in all the types so that the GroupingCollection is typed sensibly. Note that a [String,Char,String] CanBuildFrom is already floating around, so strings can be made from collections of chars.

Let's try it out:

scala> List(true,false,true,true,true).groupedWhile(_ == _)
res1: List[List[Boolean]] = List(List(true), List(false), List(true, true, true))

scala> Array(1,2,5,3,5,6,7,4,1).groupedWhile(_ <= _) 
res2: Array[Array[Int]] = Array(Array(1, 2, 5), Array(3, 5, 6, 7), Array(4), Array(1))

scala> "Hello there!!".groupedWhile(_.isLetter == _.isLetter)
res3: Vector[String] = Vector(Hello,  , there, !!)
