Why is zipped faster than zip in Scala?


Question

I have written some Scala code to perform an element-wise operation on a collection. I defined two methods that perform the same task: one uses zip and the other uses zipped.

def ES(arr: Array[Double], arr1: Array[Double]): Array[Double] = arr.zip(arr1).map(x => x._1 + x._2)

def ES1(arr: Array[Double], arr1: Array[Double]): Array[Double] = (arr, arr1).zipped.map((x, y) => x + y)

To compare the speed of these two methods, I wrote the following code:

def fun(arr: Array[Double], arr1: Array[Double], f: (Array[Double], Array[Double]) => Array[Double], itr: Int): Unit = {
  val t0 = System.nanoTime()
  // Call f itr times and discard the results; only the elapsed time matters.
  for (i <- 1 to itr) {
    f(arr, arr1)
  }
  val t1 = System.nanoTime()
  println("Total Time Consumed:" + (t1 - t0).toDouble / 1000000000 + "Seconds")
}

I call the fun method and pass ES and ES1 as shown below:

fun(Array.fill(10000)(math.random), Array.fill(10000)(math.random), ES , 100000)
fun(Array.fill(10000)(math.random), Array.fill(10000)(math.random), ES1, 100000)

The results show that ES1, which uses zipped, is faster than ES, which uses zip. Based on these observations, I have two questions.

Why is zipped faster than zip?

Is there an even faster way to do element-wise operations on a collection in Scala?

Recommended answer

To answer your second question:

Is there an even faster way to do element-wise operations on a collection in Scala?

The sad truth is that, despite their conciseness, improved productivity, and resilience to bugs, functional languages aren't necessarily the most performant. Using higher-order functions to define a projection to be executed against collections is not free, and your tight loop highlights this. As others have pointed out, the additional storage allocated for intermediate and final results also adds overhead.
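To make the allocation point concrete, here is a minimal sketch that spells out the two passes hidden inside `arr.zip(arr1).map(...)`. The names `ZipSteps` and `esExplicit` are hypothetical and not part of the original code. The intermediate array of pairs is the extra storage referred to above; `zipped` walks both arrays in lockstep and does not build it.

```scala
object ZipSteps {
  // Hypothetical helper: the same computation as ES, with the hidden intermediate named.
  def esExplicit(arr: Array[Double], arr1: Array[Double]): Array[Double] = {
    // Pass 1: allocate a new array holding one pair per element
    // (truncated to the shorter of the two inputs).
    val pairs: Array[(Double, Double)] = arr.zip(arr1)
    // Pass 2: allocate the result array and call the function once per pair.
    pairs.map(p => p._1 + p._2)
  }

  def main(args: Array[String]): Unit = {
    val a = Array(1.0, 2.0, 3.0)
    val b = Array(10.0, 20.0)
    println(esExplicit(a, b).mkString(", ")) // prints 11.0, 22.0
  }
}
```

With 10000-element inputs and 100000 iterations, the intermediate array in pass 1 is allocated and discarded on every call, which is consistent with ES being the slowest variant in the timings reported further down.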

If performance is critical (and this is by no means a universal rule), in cases like yours you can unwind Scala's operations back into imperative equivalents in order to regain more direct control over memory usage and eliminate function calls.

In your specific example, the zipped sums can be performed imperatively by pre-allocating a fixed-size mutable array of the correct length (the shorter of the two input lengths, since zip stops when one of the collections runs out of elements), and then adding the elements at each index together (accessing array elements by ordinal index is a very fast operation).

Adding a third function, ES3, to your test suite:

def ES3(arr: Array[Double], arr1: Array[Double]): Array[Double] = {
  // zip stops at the shorter input, so the result has that length.
  val minSize = math.min(arr.length, arr1.length)
  val array = Array.ofDim[Double](minSize)
  for (i <- 0 until minSize) {
    array(i) = arr(i) + arr1(i)
  }
  array
}

On my i7 I get the following times:

OP ES Total Time Consumed:23.3747857Seconds
OP ES1 Total Time Consumed:11.7506995Seconds
--
ES3 Total Time Consumed:1.0255231Seconds

Even more heinous would be to mutate the shorter of the two arrays directly in place. This obviously destroys the original contents of that array, so it should only be done if the original array is no longer needed:

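def ES4(arr: Array[Double], arr1: Array[Double]): Array[Double] = {
  val minSize = math.min(arr.length, arr1.length)
  // Reuse the shorter input (arr1 when the lengths are equal) as the output buffer;
  // its original contents are overwritten.
  val array = if (arr.length < arr1.length) arr else arr1
  for (i <- 0 until minSize) {
    array(i) = arr(i) + arr1(i)
  }
  array
}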

Total Time Consumed:0.3542098Seconds

But obviously, direct mutation of array elements isn't in the spirit of Scala.
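If you do go imperative, one further variation worth measuring is a plain `while` loop. In Scala 2, `for (i <- 0 until n)` is rewritten to a `foreach` call on a `Range` with a closure, so a `while` loop is the form that literally removes the per-element function call from the source. The sketch below is a variation on ES3; the names `WhileSum` and `ES5` are hypothetical, this version is not included in the timings above, and whether it is actually faster on a given JVM is something to confirm with your own benchmark.

```scala
object WhileSum {
  // Hypothetical ES5: same result as ES3, using a while loop instead of a for comprehension.
  def ES5(arr: Array[Double], arr1: Array[Double]): Array[Double] = {
    val minSize = math.min(arr.length, arr1.length)
    val array = new Array[Double](minSize)
    var i = 0
    while (i < minSize) {
      array(i) = arr(i) + arr1(i)
      i += 1
    }
    array
  }
}
```

It can be passed to the same harness as the other variants, for example `fun(a, b, WhileSum.ES5, 100000)` with `a` and `b` being the two input arrays.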
