获取 Scala Iterable 的前 n 个元素的最简单方法 [英] Simplest way to get the top n elements of a Scala Iterable

查看:33
本文介绍了获取 Scala Iterable 的前 n 个元素的最简单方法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

是否有一个简单有效的解决方案来确定 Scala Iterable 的前 n 个元素?我的意思是像

Is there a simple and efficient solution to determine the top n elements of a Scala Iterable? I mean something like

iter.toList.sortBy(_.myAttr).take(2)

但是当只有前 2 个感兴趣时不必对所有元素进行排序.理想情况下,我正在寻找类似的东西

but without having to sort all elements when only the top 2 are of interest. Ideally I'm looking for something like

iter.top(2, _.myAttr)

另见:使用排序的顶部元素的解决方案:在 Scala 中,如何将排序 [T] 与 List.min 或 List 一起使用.最大并保持代码可读

see also: Solution for the top element using an Ordering: In Scala, how to use Ordering[T] with List.min or List.max and keep code readable

感谢大家的解决方案.最后,我采用了 user unknown 的原始解决方案,并采用了它来使用 Iterablepimp-my-library 模式:

Thank you all for your solutions. Finally, I took the original solution of user unknown and adopted it to use Iterable and the pimp-my-library pattern:

implicit def iterExt[A](iter: Iterable[A]) = new {
  def top[B](n: Int, f: A => B)(implicit ord: Ordering[B]): List[A] = {
    def updateSofar (sofar: List [A], el: A): List [A] = {
      //println (el + " - " + sofar)

      if (ord.compare(f(el), f(sofar.head)) > 0)
        (el :: sofar.tail).sortBy (f)
      else sofar
    }

    val (sofar, rest) = iter.splitAt(n)
    (sofar.toList.sortBy (f) /: rest) (updateSofar (_, _)).reverse
  }
}

case class A(s: String, i: Int)
val li = List (4, 3, 6, 7, 1, 2, 9, 5).map(i => A(i.toString(), i))
println(li.top(3, _.i))

推荐答案

我的解决方案(绑定到 Int,但应该很容易更改为 Ordered(请几分钟):

My solution (bound to Int, but should be easily changed to Ordered (a few minutes please):

def top (n: Int, li: List [Int]) : List[Int] = {

  def updateSofar (sofar: List [Int], el: Int) : List [Int] = {
    // println (el + " - " + sofar)
    if (el < sofar.head) 
      (el :: sofar.tail).sortWith (_ > _) 
    else sofar
  }

  /* better readable:
    val sofar = li.take (n).sortWith (_ > _)
    val rest = li.drop (n)
    (sofar /: rest) (updateSofar (_, _)) */    
  (li.take (n). sortWith (_ > _) /: li.drop (n)) (updateSofar (_, _)) 
}

用法:

val li = List (4, 3, 6, 7, 1, 2, 9, 5)    
top (2, li)

  • 对于上面的列表,将前 2 (4, 3) 个作为起始 TopTen (TopTwo).
  • 对它们进行排序,使第一个元素是较大的元素(如果有).
  • 重复遍历列表的其余部分(li.drop(n)),并将当前元素与最小值列表的最大值进行比较;如有必要,更换并再次使用.
  • 改进:
    • 扔掉Int,使用ordered.
    • 扔掉 (_ > _) 并使用用户排序来允许BottomTen.(更难:选择中间的 10 个 :) )
    • 扔掉List,改用Iterable
    • def extremeN [T](n: Int, li: List [T])
        (comp1: ((T, T) => Boolean), comp2: ((T, T) => Boolean)):
           List[T] = {
      
        def updateSofar (sofar: List [T], el: T) : List [T] =
          if (comp1 (el, sofar.head)) 
            (el :: sofar.tail).sortWith (comp2 (_, _)) 
          else sofar
      
        (li.take (n) .sortWith (comp2 (_, _)) /: li.drop (n)) (updateSofar (_, _)) 
      }
      
      /*  still bound to Int:  
      def top (n: Int, li: List [Int]) : List[Int] = {
        extremeN (n, li) ((_ < _), (_ > _))
      }
      def bottom (n: Int, li: List [Int]) : List[Int] = {
        extremeN (n, li) ((_ > _), (_ < _))
      }
      */
      
      def top [T] (n: Int, li: List [T]) 
        (implicit ord: Ordering[T]): Iterable[T] = {
        extremeN (n, li) (ord.lt (_, _), ord.gt (_, _))
      }
      def bottom [T] (n: Int, li: List [T])
        (implicit ord: Ordering[T]): Iterable[T] = {
        extremeN (n, li) (ord.gt (_, _), ord.lt (_, _))
      }
      
      top (3, li)
      bottom (3, li)
      val sl = List ("Haus", "Garten", "Boot", "Sumpf", "X", "y", "xkcd", "x11")
      bottom (2, sl)
      

      用 Iterable 替换 List 似乎有点难.

      To replace List with Iterable seems to be a bit harder.

      正如 Daniel C. Sobral 在评论中指出的那样,topN 中的高 n 会导致大量的排序工作,因此进行手动插入排序而不是重复排序可能很有用前 n 个元素的整个列表:

      As Daniel C. Sobral pointed out in the comments, a high n in topN can lead to much sorting work, so that it could be useful, to do a manual insertion sort instead of repeatedly sorting the whole list of top-n elements:

      def extremeN [T](n: Int, li: List [T])
        (comp1: ((T, T) => Boolean), comp2: ((T, T) => Boolean)):
           List[T] = {
      
        def sortedIns (el: T, list: List[T]): List[T] = 
          if (list.isEmpty) List (el) else 
          if (comp2 (el, list.head)) el :: list else 
            list.head :: sortedIns (el, list.tail)
      
        def updateSofar (sofar: List [T], el: T) : List [T] =
          if (comp1 (el, sofar.head)) 
            sortedIns (el, sofar.tail)
          else sofar
      
        (li.take (n) .sortWith (comp2 (_, _)) /: li.drop (n)) (updateSofar (_, _)) 
      }
      

      top/bottom 方法和用法同上.对于一小组顶部/底部元素,很少会调用排序,开始时会调用几次,然后随着时间的推移越来越少.例如,顶部 (10) 为 10 000 时为 70 次,顶部 (10) 为 100 000 时为 90 次.

      top/bottom method and usage as above. For small groups of top/bottom Elements, the sorting is rarely called, a few times in the beginning, and then less and less often over time. For example, 70 times with top (10) of 10 000, and 90 times with top (10) of 100 000.

      这篇关于获取 Scala Iterable 的前 n 个元素的最简单方法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆