Parallel iterator in Scala
Question
Is it somehow possible, using Scala's parallel collections, to parallelize an Iterator without evaluating it completely beforehand?
Here I am talking about parallelizing the functional transformations on an Iterator, namely map and flatMap. I think this requires evaluating some elements of the Iterator in advance, and then computing more once some have been consumed via next.
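The "evaluate a few elements ahead, then top up as they are consumed" idea can be sketched with plain Futures from the standard library. This is only an illustration under stated assumptions: PrefetchMap, parMap and the lookahead parameter are hypothetical names, not an existing Scala or library API, and the sketch covers map only.

```scala
import scala.collection.mutable
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

// Hypothetical helper, not part of the Scala standard library.
object PrefetchMap {
  /** Lazily maps `f` over `source`, keeping at most `lookahead` results in flight. */
  def parMap[A, B](source: Iterator[A], lookahead: Int)(f: A => B)
                  (implicit ec: ExecutionContext): Iterator[B] =
    new Iterator[B] {
      require(lookahead > 0, "lookahead must be positive")

      // Futures for elements already pulled from `source`, in source order.
      private val inFlight = mutable.Queue.empty[Future[B]]

      // Start new tasks until the window is full or the source is exhausted.
      private def fill(): Unit =
        while (inFlight.size < lookahead && source.hasNext) {
          val a = source.next() // the source is only touched on the caller's thread
          inFlight.enqueue(Future(f(a)))
        }

      def hasNext: Boolean = { fill(); inFlight.nonEmpty }

      def next(): B = {
        if (!hasNext) throw new NoSuchElementException("next on empty iterator")
        val head = inFlight.dequeue()
        fill() // top up the window before blocking on the oldest result
        Await.result(head, Duration.Inf)
      }
    }
}
```

With an implicit ExecutionContext in scope (for example scala.concurrent.ExecutionContext.Implicits.global), PrefetchMap.parMap(Iterator.from(1), 8)(expensive) would pull at most a handful of elements past the one being returned, so an infinite or very large source is never fully evaluated. Results come back in source order, which means one slow element can stall the ones queued behind it.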
Everything I could find requires converting the iterator to an Iterable or, at best, a Stream. The Stream then gets completely evaluated when I call .par on it.
I also welcome implementation proposals if this is not readily available. Implementations should support parallel map and flatMap.
Recommended answer
I realize that this is an old question, but does the ParIterator implementation in the iterata library do what you were looking for?
scala> import com.timgroup.iterata.ParIterator.Implicits._
scala> val it = (1 to 100000).toIterator.par().map(n => (n + 1, Thread.currentThread.getId))
scala> it.map(_._2).toSet.size
res2: Int = 8 // addition was distributed over 8 threads
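For comparison, a similar effect can be approximated without any extra dependency by processing the iterator one chunk at a time. The sketch below is an assumption-laden illustration, not iterata's actual code: ChunkedPar, parMap, parFlatMap and chunkSize are hypothetical names, and it uses standard-library Futures rather than parallel collections so that it does not depend on the separate scala-parallel-collections module.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

// Hypothetical helper illustrating chunk-wise evaluation; not taken from any library.
object ChunkedPar {
  /** Applies `f` to one chunk at a time; elements within a chunk run concurrently. */
  def parFlatMap[A, B](source: Iterator[A], chunkSize: Int)(f: A => Iterable[B])
                      (implicit ec: ExecutionContext): Iterator[B] =
    source.grouped(chunkSize).flatMap { chunk =>
      // Only the current chunk is materialized; later chunks stay unevaluated.
      val tasks = chunk.map(a => Future(f(a)))
      Await.result(Future.sequence(tasks), Duration.Inf).flatten
    }

  /** `map` expressed through `parFlatMap` with single-element results. */
  def parMap[A, B](source: Iterator[A], chunkSize: Int)(f: A => B)
                  (implicit ec: ExecutionContext): Iterator[B] =
    parFlatMap(source, chunkSize)(a => List(f(a)))
}
```

Here each chunk must finish before the next one starts, so the chunk size trades memory and latency against parallelism: a larger chunk keeps more threads busy but evaluates more of the iterator ahead of the consumer.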