Is Scala functional programming slower than traditional coding?

In one of my first attempts to create functional code, I ran into a performance issue.

I started with a common task - multiply the elements of two arrays and sum up the results:

var first: Array[Float] = ...
var second: Array[Float] = ...
var sum = 0f
for (ix <- 0 until first.length)
    sum += first(ix) * second(ix)

Here is how I rewrote the work in a functional style:

sum = first.zip(second).map{ case (a,b) => a*b }.reduceLeft(_+_)

When I benchmarked the two approaches, the second method takes 40 times as long to complete!

Why does the second method take so much longer? How can I reform the work so that it is both fast and written in a functional programming style?

Solution

The main reasons why these two examples are so different in speed are:

  • the faster one doesn't use any generics, so it doesn't face boxing/unboxing (see the sketch after this list).
  • the faster one doesn't create temporary collections and, thus, avoids extra memory copies.
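As a minimal sketch of the boxing point (the method name here is illustrative, not from the question): any Float that passes through a generic position gets boxed, because JVM generics are erased to Object:

def boxedIdentity[A](a: A): A = a   // A erases to Object on the JVM

val x: Float = boxedIdentity(1.0f)  // 1.0f is boxed to java.lang.Float on the
                                    // way in and unboxed on the way out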

Let's consider the slower one part by part. First:

first.zip(second)

That creates a new array, an array of Tuple2. It will copy all elements from both arrays into Tuple2 objects, and then copy a reference to each of these objects into a third array. Now, notice that Tuple2 is parameterized, so it can't store Float directly. Instead, new instances of java.lang.Float are created for each number, the numbers are stored in them, and then a reference for each of them is stored into the Tuple2.
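To make that concrete, here is roughly the work first.zip(second) implies, written out by hand; this is a sketch of the allocations, not the actual library implementation:

val zipped = new Array[(Float, Float)](first.length min second.length)
var i = 0
while (i < zipped.length) {
  zipped(i) = (first(i), second(i))  // boxes two Floats, allocates one Tuple2
  i += 1
}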

map{ case (a,b) => a*b }

Now a fourth array is created. To compute the values of these elements, it needs to read the reference to the tuple from the third array, read the reference to the java.lang.Float stored in them, read the numbers, multiply, create a new java.lang.Float to store the result, and then pass this reference back, which will be de-referenced again to be stored in the array (arrays are not type-erased).
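Again as a hand-written sketch of the allocations involved, the map step amounts to something like this:

val products = new Array[Float](zipped.length)
var j = 0
while (j < zipped.length) {
  val t = zipped(j)          // read the Tuple2 reference from the third array
  products(j) = t._1 * t._2  // unbox both Floats; the real map goes through a
                             // generic closure, so the product is boxed and
                             // then unboxed again to land in the Float array
  j += 1
}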

We are not finished, though. Here's the next part:

reduceLeft(_+_)

That one is relatively harmless, except that it still does boxing/unboxing and java.lang.Float creation at each iteration, since reduceLeft receives a Function2, which is parameterized.
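The boxing is baked into the signature; in the collections API, reduceLeft is declared along these lines (simplified):

def reduceLeft[B >: A](op: (B, A) => B): B  // op is a generic Function2, so
                                            // every call boxes its Float arguments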

Scala 2.8 introduces a feature called specialization which gets rid of a lot of this boxing/unboxing. But let's consider alternative, faster versions. We could, for instance, do map and reduceLeft in a single step:

sum = first.zip(second).foldLeft(0f) { case (a, (b, c)) => a + b * c }

We could use view (Scala 2.8) or projection (Scala 2.7) to avoid creating intermediary collections altogether:

sum = first.view.zip(second).map{ case (a,b) => a*b }.reduceLeft(_+_)

This last one doesn't save much, actually, so I think the non-strictness is being "lost" pretty fast (i.e., one of these methods is strict even in a view). There's also an alternative way of zipping that is non-strict (i.e., avoids some intermediary results) by default:

sum = (first,second).zipped.map{ case (a,b) => a*b }.reduceLeft(_+_)

This gives a much better result than the former. It is better than the foldLeft one too, though not by much. Unfortunately, we can't combine zipped with foldLeft, because the former doesn't support the latter.

The last one is the fastest I could get. Faster than that, only with specialization. Now, Function2 happens to be specialized, but only for Int, Long and Double. The other primitives were left out, as specialization increases code size rather dramatically for each primitive. In my tests, though, Double actually takes longer. That might be a result of it being twice the size, or it might be something I'm doing wrong.
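To show the general shape of what specialization buys, here is a hand-rolled sketch that sidesteps the generic Function2 with a monomorphic trait; the names are illustrative, not a library API:

trait FloatOp2 {
  def apply(a: Float, b: Float): Float  // primitives in, primitive out: no boxing
}

def combineSum(first: Array[Float], second: Array[Float], op: FloatOp2): Float = {
  var sum = 0f
  var i = 0
  while (i < first.length) {
    sum += op(first(i), second(i))
    i += 1
  }
  sum
}

val sum = combineSum(first, second, new FloatOp2 {
  def apply(a: Float, b: Float) = a * b
})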

So, in the end, the problem is a combination of factors, including producing intermediary copies of elements and the way Java (JVM) handles primitives and generics. Similar code in Haskell using supercompilation would be equal to anything short of assembler. On the JVM, you have to be aware of the trade-offs and be prepared to optimize critical code.
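If you want to reproduce the comparison, a crude harness like the following is enough to see the ordering; JVM micro-benchmarks are noisy, so treat the numbers as indicative and run each variant several times to warm up the JIT:

def time[A](label: String)(body: => A): A = {
  val start = System.nanoTime
  val result = body
  println(label + ": " + (System.nanoTime - start) / 1000000 + " ms")
  result
}

val n = 1000000
val first  = Array.fill(n)(util.Random.nextFloat)
val second = Array.fill(n)(util.Random.nextFloat)

time("while loop") {
  var sum = 0f
  var i = 0
  while (i < first.length) { sum += first(i) * second(i); i += 1 }
  sum
}

time("zipped/map/reduceLeft") {
  (first, second).zipped.map { case (a, b) => a * b }.reduceLeft(_ + _)
}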
