为什么是Scala“for循环理解”所以与FOR循环相比非常慢? [英] Why are Scala "for loop comprehensions" so very slow compared to FOR loops?

查看:302
本文介绍了为什么是Scala“for循环理解”所以与FOR循环相比非常慢?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

有人说,斯卡拉理解其实很慢。我给出的理由是,由于Java的限制,为了理解(比如reduce,下面使用)需要为每个迭代生成一个临时对象,以便调用传入的函数。

是...这是...真的吗?下面的测试似乎证实了这一点,但我不完全明白为什么这样。



这可能对lambdas或匿名函数有意义,但不能对于非匿名函数。

在我的测试中,我运行了针对list.reduce的循环(见下面的代码),发现它们的速度是以前的两倍当每个迭代调用完全相同的函数,传递给reduce!



我觉得这非常反直觉(一旦认为Scala库会被仔细创建在一个测试中,我放在一起,运行相同的循环(总计数字1百万,而不管溢出)五不同的方式:


  1. for循环数组值

  2. for循环,但调用一个函数而不是内联算术

  3. for循环,创建一个包含附加函数的对象
  4. list.reduce,传递一个匿名的乐趣ction

  5. list.reduce传入一个对象成员函数


    结果如下:
    test:min / max / average(毫秒)

      1。 27/157 / 64.78 
    2.27 / 192 / 65.77 <---注意测试1,2和4,5
    之间的相似性3.139 / 313 / 202.58
    4。 63/342 / 150.18
    5. 63/341 / 149.99

    可以看出, 理解版本的顺序是for for new for each instance,这意味着实际上可以为匿名函数和非匿名函数版本执行新。

    方法:将下面的代码(除去测试调用)编译成一个.jar文件,以确保所有版本运行相同的库代码。每个迭代中的每个测试都在一个新的JVM(即每个测试的scala -cp ...)中调用,以消除堆大小问题。

      class t(val i:Int){
    def summit(j:Int)= j + i
    }

    object bar {
    val biglist:List [Int] =(1 to 1000000).toList

    def summit(i:Int,j:Int)= i + j

    // Simple for loop
    def forloop:Int = {
    var result:Int = 0
    for(i< - biglist){
    result + = i
    }
    结果
    }

    //用于函数循环而不是内联数学
    def forloop2:Int = {
    var result:Int = 0
    for (i< - biglist){
    result = summit(result,i)
    }
    结果
    }

    //用于生成循环对象PER迭代
    def forloop3:Int = {
    var result:Int = 0
    for(i< - biglist){
    val t = n ew t(result)
    result = t.summit(i)
    }
    result
    }

    //带有一个匿名函数的list.reduce in
    def anonymousfunc:Int = {
    biglist.reduce((i,j)=> {i + j})
    }

    //带有命名函数的$ list.reduce
    def realfunc:Int = {
    biglist.reduce(summit)
    }

    //为简洁起见,删除了测试调用代码。举一个例子:
    args(0)match {
    case1=> {
    val start = System.currentTimeMillis()
    forloop
    val end = System.currentTimeMillis()
    println(for =+(end - start))
    }
    ...
    }


    解决方案

    你们被告知理解是真的,但你们的问题是,你们把理解和匿名功能混在一起了。

    为理解在Scala中是一系列 .flatMap .map .filter 应用程序。由于您正在测试简化算法,并且由于使用这三个函数实现简化算法是不可能的,所以您的测试用例是不正确的。

    下面是一个for comprehension :

      val listOfLists = List(List(1,2),List(3,4),List(5))
    val result =
    for {
    itemOfListOfLists< - listOfLists
    itemOfItemOfListOfLists< - itemOfListOfLists
    }
    yield(itemOfItemOfListOfLists + 1)
    assert (result == List(2,3,4,5,6))

    编译器解析理解部分如下:

      val结果= 
    listOfLists.flatMap(
    itemOfListOfLists => itemOfListOfLists .map(
    itemOfItemOfListOfLists => itemOfItemOfListOfLists + 1




    然后解除匿名函数的语法:

    $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ listOfLists.flatMap(
    $ new Function1 [List [Int],List [Int]] {
    override def apply(itemOfListOfLists:List [Int]):List [Int] =
    itemOfListOfLists.map(
    new Function1 [Int,Int] {
    override def apply(itemOfItemOfListOfLists:Int):Int =
    itemOfItemOfListOfLists + 1
    }

    }





    $ b

    从现在的代码可以看出, Function1 [Int,Int] 类在每次应用(itemOfListOfLists:List [Int]):List [Int] 方法被调用时被实例化。这发生在 listOfLists 的每个条目中。所以你理解的越复杂,你得到的 Function 对象的实例化越多。


    It has been said that Scala For comprehensions are actually quite slow. The reason I was given was that due to a Java limitation, for comprehensions (such as "reduce", used below) need to generate a temporary object with each iteration in order to call the function passed in.

    IS...THIS...TRUE? The tests below seem to bear this out, but I do not completely understand why this is the case.

    This might make sense for "lambdas" or anonymous functions, but not for non-anonymous functions.

    In my test, I ran for loops against list.reduce (see code below), and found them to be over twice as fast, even when each iteration called the exact same function that was passed to reduce!

    I find this amazingly counter-intuitive (once would think that the Scala library would've been carefully created to be as optimal as possible).

    In a test I put together, I ran the same loop (sum up the numbers 1 to a million, irrespective of overflow) five different ways:

    1. for loop over the array of values
    2. for loop, but calling a function instead of inline arithmetic
    3. for loop, creating an object that contains an addition function
    4. list.reduce, passing i an anonymous function
    5. list.reduce, passing in an object member function

    Results were as follows: test: min/max/average (milliseconds)

    1. 27/157/64.78
    2. 27/192/65.77 <--- note the similarity between tests 1,2 and 4,5
    3. 139/313/202.58
    4. 63/342/150.18
    5. 63/341/149.99
    

    As can be seen, the "for comprehension" versions are on the order of "for with new for each instance", implying that a "new" may in fact be performed for both the anonymous and non-anonymous function versions.

    Methodology: code below (test call removed) was compiled into a single .jar file to ensure all versions ran the same library code. Each test in each iteration was called in a new JVM (i.e. scala -cp ... for each test) in order to remove heap size issues.

    class t(val i: Int) {
        def summit(j: Int) = j + i
    }
    
    object bar {
        val biglist:List[Int]  =  (1 to 1000000).toList
    
        def summit(i: Int, j:Int) = i+j
    
        // Simple for loop
        def forloop:  Int = {
            var result: Int = 0
            for(i <- biglist) {
                result += i
            }
            result
        }
    
        // For loop with a function instead of inline math
        def forloop2:  Int = {
            var result: Int = 0
            for(i <- biglist) {
                result = summit(result,i)
            }
            result
        }
    
        // for loop with a generated object PER iteration
        def forloop3: Int = {
            var result: Int = 0
            for(i <- biglist) {
                val t = new t(result)
                result = t.summit(i)
            }
            result
        }
    
        // list.reduce with an anonymous function passed in
        def anonymousfunc: Int = {
            biglist.reduce((i,j) => {i+j})
        }
    
        // list.reduce with a named function
        def realfunc: Int = {
            biglist.reduce(summit)
        }
    
        // test calling code excised for brevity. One example given:
        args(0) match {
            case "1" => {
                        val start = System.currentTimeMillis()
                        forloop
                        val end = System.currentTimeMillis()
                        println("for="+(end - start))
                        }
             ...
    }
    

    解决方案

    What you were told was true about "for comprehensions", but the problem with your question is that you've mixed up "for comprehensions" with "anonymous functions".

    "For comprehension" in Scala is a syntactic sugar for a series of .flatMap, .map and .filter applications. Since you are testing reduction algorithms and since it's impossible to implement a reduction algorithm using those three functions, your test cases are incorrect.

    Here's an example of a "for comprehension":

    val listOfLists = List(List(1,2), List(3,4), List(5))
    val result = 
      for {
        itemOfListOfLists <- listOfLists
        itemOfItemOfListOfLists <- itemOfListOfLists
      }
      yield (itemOfItemOfListOfLists + 1)
    assert( result == List(2,3,4,5,6) )
    

    The compiler desugars the Comprehension part to the following:

    val result =
      listOfLists.flatMap(
        itemOfListOfLists => itemOfListOfLists.map(
          itemOfItemOfListOfLists => itemOfItemOfListOfLists + 1
        )
      )
    

    Then it desugars the Anonymous Function syntax:

    val result =
      listOfLists.flatMap(
        new Function1[List[Int], List[Int]] {
          override def apply(itemOfListOfLists: List[Int]): List[Int] =
            itemOfListOfLists.map(
              new Function1[Int, Int] {
                override def apply(itemOfItemOfListOfLists: Int): Int =
                  itemOfItemOfListOfLists + 1
              }
            )
        }
      )
    

    From the desugarred code it's now evident that the Function1[Int, Int] class gets instantiated every time the apply(itemOfListOfLists: List[Int]): List[Int] method gets called. That happens for every entry of listOfLists. So the more complex your comprehension is the more instantiations of the Function objects you get.

    这篇关于为什么是Scala“for循环理解”所以与FOR循环相比非常慢?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆