为什么是Scala“for循环理解”所以与FOR循环相比非常慢? [英] Why are Scala "for loop comprehensions" so very slow compared to FOR loops?
问题描述
是...这是...真的吗?下面的测试似乎证实了这一点,但我不完全明白为什么这样。
这可能对lambdas或匿名函数有意义,但不能对于非匿名函数。
在我的测试中,我运行了针对list.reduce的循环(见下面的代码),发现它们的速度是以前的两倍当每个迭代调用完全相同的函数,传递给reduce!
我觉得这非常反直觉(一旦认为Scala库会被仔细创建在一个测试中,我放在一起,运行相同的循环(总计数字1百万,而不管溢出)五不同的方式:
- for循环数组值
- for循环,但调用一个函数而不是内联算术
- for循环,创建一个包含附加函数的对象
- list.reduce,传递一个匿名的乐趣ction
- list.reduce传入一个对象成员函数
结果如下:
test:min / max / average(毫秒)
1。 27/157 / 64.78
2.27 / 192 / 65.77 <---注意测试1,2和4,5
之间的相似性3.139 / 313 / 202.58
4。 63/342 / 150.18
5. 63/341 / 149.99
可以看出, 理解版本的顺序是for for new for each instance,这意味着实际上可以为匿名函数和非匿名函数版本执行新。
方法:将下面的代码(除去测试调用)编译成一个.jar文件,以确保所有版本运行相同的库代码。每个迭代中的每个测试都在一个新的JVM(即每个测试的scala -cp ...)中调用,以消除堆大小问题。
class t(val i:Int){
def summit(j:Int)= j + i
}
object bar {
val biglist:List [Int] =(1 to 1000000).toList
def summit(i:Int,j:Int)= i + j
// Simple for loop
def forloop:Int = {
var result:Int = 0
for(i< - biglist){
result + = i
}
结果
}
//用于函数循环而不是内联数学
def forloop2:Int = {
var result:Int = 0
for (i< - biglist){
result = summit(result,i)
}
结果
}
//用于生成循环对象PER迭代
def forloop3:Int = {
var result:Int = 0
for(i< - biglist){
val t = n ew t(result)
result = t.summit(i)
}
result
}
//带有一个匿名函数的list.reduce in
def anonymousfunc:Int = {
biglist.reduce((i,j)=> {i + j})
}
//带有命名函数的$ list.reduce
def realfunc:Int = {
biglist.reduce(summit)
}
//为简洁起见,删除了测试调用代码。举一个例子:
args(0)match {
case1=> {
val start = System.currentTimeMillis()
forloop
val end = System.currentTimeMillis()
println(for =+(end - start))
}
...
}
解决方案你们被告知理解是真的,但你们的问题是,你们把理解和匿名功能混在一起了。
为理解在Scala中是一系列
.flatMap
,.map
和.filter
应用程序。由于您正在测试简化算法,并且由于使用这三个函数实现简化算法是不可能的,所以您的测试用例是不正确的。
下面是一个for comprehension :
val listOfLists = List(List(1,2),List(3,4),List(5))
val result =
for {
itemOfListOfLists< - listOfLists
itemOfItemOfListOfLists< - itemOfListOfLists
}
yield(itemOfItemOfListOfLists + 1)
assert (result == List(2,3,4,5,6))
编译器解析理解部分如下:
val结果=
listOfLists.flatMap(
itemOfListOfLists => itemOfListOfLists .map(
itemOfItemOfListOfLists => itemOfItemOfListOfLists + 1
)
)
然后解除匿名函数的语法:
$ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ listOfLists.flatMap(
$ new Function1 [List [Int],List [Int]] {
override def apply(itemOfListOfLists:List [Int]):List [Int] =
itemOfListOfLists.map(
new Function1 [Int,Int] {
override def apply(itemOfItemOfListOfLists:Int):Int =
itemOfItemOfListOfLists + 1
}
)
}
)
$ b从现在的代码可以看出,
Function1 [Int,Int]
类在每次应用(itemOfListOfLists:List [Int]):List [Int] 方法被调用时被实例化。这发生在 listOfLists
的每个条目中。所以你理解的越复杂,你得到的Function
对象的实例化越多。It has been said that Scala For comprehensions are actually quite slow. The reason I was given was that due to a Java limitation, for comprehensions (such as "reduce", used below) need to generate a temporary object with each iteration in order to call the function passed in.
IS...THIS...TRUE? The tests below seem to bear this out, but I do not completely understand why this is the case.
This might make sense for "lambdas" or anonymous functions, but not for non-anonymous functions.
In my test, I ran for loops against list.reduce (see code below), and found them to be over twice as fast, even when each iteration called the exact same function that was passed to reduce!
I find this amazingly counter-intuitive (once would think that the Scala library would've been carefully created to be as optimal as possible).
In a test I put together, I ran the same loop (sum up the numbers 1 to a million, irrespective of overflow) five different ways:
- for loop over the array of values
- for loop, but calling a function instead of inline arithmetic
- for loop, creating an object that contains an addition function
- list.reduce, passing i an anonymous function
- list.reduce, passing in an object member function
Results were as follows: test: min/max/average (milliseconds)
1. 27/157/64.78 2. 27/192/65.77 <--- note the similarity between tests 1,2 and 4,5 3. 139/313/202.58 4. 63/342/150.18 5. 63/341/149.99
As can be seen, the "for comprehension" versions are on the order of "for with new for each instance", implying that a "new" may in fact be performed for both the anonymous and non-anonymous function versions.
Methodology: code below (test call removed) was compiled into a single .jar file to ensure all versions ran the same library code. Each test in each iteration was called in a new JVM (i.e. scala -cp ... for each test) in order to remove heap size issues.
class t(val i: Int) { def summit(j: Int) = j + i } object bar { val biglist:List[Int] = (1 to 1000000).toList def summit(i: Int, j:Int) = i+j // Simple for loop def forloop: Int = { var result: Int = 0 for(i <- biglist) { result += i } result } // For loop with a function instead of inline math def forloop2: Int = { var result: Int = 0 for(i <- biglist) { result = summit(result,i) } result } // for loop with a generated object PER iteration def forloop3: Int = { var result: Int = 0 for(i <- biglist) { val t = new t(result) result = t.summit(i) } result } // list.reduce with an anonymous function passed in def anonymousfunc: Int = { biglist.reduce((i,j) => {i+j}) } // list.reduce with a named function def realfunc: Int = { biglist.reduce(summit) } // test calling code excised for brevity. One example given: args(0) match { case "1" => { val start = System.currentTimeMillis() forloop val end = System.currentTimeMillis() println("for="+(end - start)) } ... }
解决方案What you were told was true about "for comprehensions", but the problem with your question is that you've mixed up "for comprehensions" with "anonymous functions".
"For comprehension" in Scala is a syntactic sugar for a series of
.flatMap
,.map
and.filter
applications. Since you are testing reduction algorithms and since it's impossible to implement a reduction algorithm using those three functions, your test cases are incorrect.Here's an example of a "for comprehension":
val listOfLists = List(List(1,2), List(3,4), List(5)) val result = for { itemOfListOfLists <- listOfLists itemOfItemOfListOfLists <- itemOfListOfLists } yield (itemOfItemOfListOfLists + 1) assert( result == List(2,3,4,5,6) )
The compiler desugars the Comprehension part to the following:
val result = listOfLists.flatMap( itemOfListOfLists => itemOfListOfLists.map( itemOfItemOfListOfLists => itemOfItemOfListOfLists + 1 ) )
Then it desugars the Anonymous Function syntax:
val result = listOfLists.flatMap( new Function1[List[Int], List[Int]] { override def apply(itemOfListOfLists: List[Int]): List[Int] = itemOfListOfLists.map( new Function1[Int, Int] { override def apply(itemOfItemOfListOfLists: Int): Int = itemOfItemOfListOfLists + 1 } ) } )
From the desugarred code it's now evident that the
Function1[Int, Int]
class gets instantiated every time theapply(itemOfListOfLists: List[Int]): List[Int]
method gets called. That happens for every entry oflistOfLists
. So the more complex your comprehension is the more instantiations of theFunction
objects you get.这篇关于为什么是Scala“for循环理解”所以与FOR循环相比非常慢?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!