有效的方式来折叠scala中的列表，同时避免分配和变量 [英] Efficient way to fold list in scala, while avoiding allocations and vars

查看：114 发布时间：2018/4/18 16:04:10 scala optimization functional-programming

本文介绍了有效的方式来折叠scala中的列表，同时避免分配和变量的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我在列表中有很多项目，我需要分析内容以找出其中有多少项是完整的。我开始使用分区，但后来意识到我不需要返回两个列表，所以我切换到一个折叠：

I have a bunch of items in a list, and I need to analyze the content to find out how many of them are "complete". I started out with partition, but then realized that I didn't need to two lists back, so I switched to a fold:

val counts = groupRows.foldLeft( (0,0) )( (pair, row) => 
     if(row.time == 0) (pair._1+1,pair._2) 
     else (pair._1, pair._2+1)
   )

但我有许多并行用户需要经历很多行，并且导致很多GC活动（我的假设...... GC 可能来自其他事物，但我怀疑这一点因为我知道它会在每个折叠项上分配一个新的元组）。

but I have a lot of rows to go through for a lot of parallel users, and it is causing a lot of GC activity (assumption on my part...the GC could be from other things, but I suspect this since I understand it will allocate a new tuple on every item folded).

目前，我已将其重写为

for the time being, I've rewritten this as

var complete = 0 var incomplete = 0 list.foreach(row => if(row.time != 0) complete += 1 else incomplete += 1)

修正了GC，但引入了变数。

which fixes the GC, but introduces vars.

我想知道是否有一种方法可以在不使用变量的情况下执行此操作，同时也不会滥用GC？

I was wondering if there was a way of doing this without using vars while also not abusing the GC?

编辑：

一个var实现在大型列表上的速度似乎要快得多（比如40％），甚至比一个功能更强大但应该相当的尾递归优化版本更快。

Hard call on the answers I've received. A var implementation seems to be considerably faster on large lists (like by 40%) than even a tail-recursive optimized version that is more functional but should be equivalent.

dhg的第一个答案似乎与尾递归的性能相当，这意味着size pass是超级高效的......事实上，优化后的运行速度比我的尾递归的运行速度稍快硬件。

The first answer from dhg seems to be on-par with the performance of the tail-recursive one, implying that the size pass is super-efficient...in fact, when optimized it runs very slightly faster than the tail-recursive one on my hardware.

推荐答案

我看到您已经接受了一个答案，但您正确地提到该解决方案将遍历列表两次。

I see you've already accepted an answer, but you rightly mention that that solution will traverse the list twice. The way to do it efficiently is with recursion.

def counts(xs: List[...], complete: Int = 0, incomplete: Int = 0): (Int,Int) = xs match { case Nil => (complete, incomplete) case row :: tail => if (row.time == 0) counts(tail, complete + 1, incomplete) else counts(tail, complete, incomplete + 1) }

这实际上只是一个定制的 fold ，除了我们使用2个累加器只是 Int s（原语）而不是元组（引用类型）。它也应该与使用变量的while循环一样高效 - 事实上，字节码应该是相同的。

This is effectively just a customized fold, except we use 2 accumulators which are just Ints (primitives) instead of tuples (reference types). It should also be just as efficient a while-loop with vars - in fact, the bytecode should be identical.

这篇关于有效的方式来折叠scala中的列表，同时避免分配和变量的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

有效的方式来折叠scala中的列表，同时避免分配和变量 [英] Efficient way to fold list in scala, while avoiding allocations and vars

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

有效的方式来折叠scala中的列表，同时避免分配和变量 [英] Efficient way to fold list in scala, while avoiding allocations and vars

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

登录关闭