为什么 Scala 是“for 循环推导式"?与 FOR 循环相比非常慢? [英] Why are Scala "for loop comprehensions" so very slow compared to FOR loops?

查看:89
本文介绍了为什么 Scala 是“for 循环推导式"?与 FOR 循环相比非常慢?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

据说Scala For comprehensions实际上很慢.给我的原因是由于Java的限制,对于推导式(例如reduce",下面使用)需要在每次迭代时生成一个临时对象,以便调用传入的函数.

It has been said that Scala For comprehensions are actually quite slow. The reason I was given was that due to a Java limitation, for comprehensions (such as "reduce", used below) need to generate a temporary object with each iteration in order to call the function passed in.

这是...这是...真的吗?下面的测试似乎证实了这一点,但我不完全理解为什么会这样.

IS...THIS...TRUE? The tests below seem to bear this out, but I do not completely understand why this is the case.

这可能对lambdas"或匿名函数有意义,但对非匿名函数无效.

This might make sense for "lambdas" or anonymous functions, but not for non-anonymous functions.

在我的测试中,我针对 list.reduce 运行了 for 循环(参见下面的代码),发现它们的速度提高了两倍,即使每次迭代调用传递给 reduce 的完全相同的函数!

In my test, I ran for loops against list.reduce (see code below), and found them to be over twice as fast, even when each iteration called the exact same function that was passed to reduce!

我发现这非常违反直觉(曾经认为 Scala 库会被精心创建以尽可能优化).

I find this amazingly counter-intuitive (once would think that the Scala library would've been carefully created to be as optimal as possible).

在我放在一起的测试中,我以五种不同的方式运行了相同的循环(将数字 1 加到一百万,不考虑溢出):

In a test I put together, I ran the same loop (sum up the numbers 1 to a million, irrespective of overflow) five different ways:

  1. for 循环值数组
  2. for 循环,但调用函数而不是内联算术
  3. for 循环,创建一个包含加法函数的对象
  4. list.reduce,传递给我一个匿名函数
  5. list.reduce,传入一个对象成员函数

结果如下:测试:最小/最大/平均值(毫秒)

Results were as follows: test: min/max/average (milliseconds)

1. 27/157/64.78
2. 27/192/65.77 <--- note the similarity between tests 1,2 and 4,5
3. 139/313/202.58
4. 63/342/150.18
5. 63/341/149.99

可以看出,for comprehension"版本的顺序是for with new for each instance",这意味着实际上可以为匿名和非匿名函数版本执行new".

As can be seen, the "for comprehension" versions are on the order of "for with new for each instance", implying that a "new" may in fact be performed for both the anonymous and non-anonymous function versions.

方法:下面的代码(删除了测试调用)被编译成一个单一的 .jar 文件,以确保所有版本运行相同的库代码.每次迭代中的每个测试都在新的 JVM 中调用(即每个测试的 scala -cp ...),以消除堆大小问题.

Methodology: code below (test call removed) was compiled into a single .jar file to ensure all versions ran the same library code. Each test in each iteration was called in a new JVM (i.e. scala -cp ... for each test) in order to remove heap size issues.

class t(val i: Int) {
    def summit(j: Int) = j + i
}

object bar {
    val biglist:List[Int]  =  (1 to 1000000).toList

    def summit(i: Int, j:Int) = i+j

    // Simple for loop
    def forloop:  Int = {
        var result: Int = 0
        for(i <- biglist) {
            result += i
        }
        result
    }

    // For loop with a function instead of inline math
    def forloop2:  Int = {
        var result: Int = 0
        for(i <- biglist) {
            result = summit(result,i)
        }
        result
    }

    // for loop with a generated object PER iteration
    def forloop3: Int = {
        var result: Int = 0
        for(i <- biglist) {
            val t = new t(result)
            result = t.summit(i)
        }
        result
    }

    // list.reduce with an anonymous function passed in
    def anonymousfunc: Int = {
        biglist.reduce((i,j) => {i+j})
    }

    // list.reduce with a named function
    def realfunc: Int = {
        biglist.reduce(summit)
    }

    // test calling code excised for brevity. One example given:
    args(0) match {
        case "1" => {
                    val start = System.currentTimeMillis()
                    forloop
                    val end = System.currentTimeMillis()
                    println("for="+(end - start))
                    }
         ...
}

推荐答案

你被告知的关于for comprehensions"是正确的,但你的问题的问题是你把for comprehensions"和anonymous functions"混淆了".

What you were told was true about "for comprehensions", but the problem with your question is that you've mixed up "for comprehensions" with "anonymous functions".

For comprehension"是一系列.flatMap.map.filter 应用程序的语法糖.由于您正在测试归约算法,并且不可能使用这三个函数来实现归约算法,因此您的测试用例是不正确的.

"For comprehension" in Scala is a syntactic sugar for a series of .flatMap, .map and .filter applications. Since you are testing reduction algorithms and since it's impossible to implement a reduction algorithm using those three functions, your test cases are incorrect.

这是一个for comprehension"的例子:

Here's an example of a "for comprehension":

val listOfLists = List(List(1,2), List(3,4), List(5))
val result = 
  for {
    itemOfListOfLists <- listOfLists
    itemOfItemOfListOfLists <- itemOfListOfLists
  }
  yield (itemOfItemOfListOfLists + 1)
assert( result == List(2,3,4,5,6) )

编译器将 Comprehension 部分脱糖为以下内容:

The compiler desugars the Comprehension part to the following:

val result =
  listOfLists.flatMap(
    itemOfListOfLists => itemOfListOfLists.map(
      itemOfItemOfListOfLists => itemOfItemOfListOfLists + 1
    )
  )

然后它对匿名函数语法进行脱糖:

Then it desugars the Anonymous Function syntax:

val result =
  listOfLists.flatMap(
    new Function1[List[Int], List[Int]] {
      override def apply(itemOfListOfLists: List[Int]): List[Int] =
        itemOfListOfLists.map(
          new Function1[Int, Int] {
            override def apply(itemOfItemOfListOfLists: Int): Int =
              itemOfItemOfListOfLists + 1
          }
        )
    }
  )

从脱糖代码现在很明显 Function1[Int, Int] 类在每次 apply(itemOfListOfLists: List[Int]): List[Int] 时被实例化代码> 方法被调用.listOfLists 的每个条目都会发生这种情况.因此,您的理解越复杂,您获得的 Function 对象的实例化就越多.

From the desugarred code it's now evident that the Function1[Int, Int] class gets instantiated every time the apply(itemOfListOfLists: List[Int]): List[Int] method gets called. That happens for every entry of listOfLists. So the more complex your comprehension is the more instantiations of the Function objects you get.

这篇关于为什么 Scala 是“for 循环推导式"?与 FOR 循环相比非常慢?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆