Parallel iterator in Scala

Question

Is it possible to use Scala's parallel collections to parallelize an Iterator without evaluating it completely beforehand?

I am talking about parallelizing the functional transformations on an Iterator, namely map and flatMap. I think this requires evaluating some elements of the Iterator in advance, and then computing more once some have been consumed via next.

Everything I could find requires the iterator to be converted to an Iterable or, at best, a Stream. The Stream then gets completely evaluated when I call .par on it.

I also welcome implementation proposals if this is not readily available. Implementations should support parallel map and flatMap.

Recommended answer

I realize that this is an old question, but does the ParIterator implementation in the iterata library do what you were looking for?

scala> import com.timgroup.iterata.ParIterator.Implicits._
scala> val it = (1 to 100000).toIterator.par().map(n => (n + 1, Thread.currentThread.getId))
scala> it.map(_._2).toSet.size
res2: Int = 8 // addition was distributed over 8 threads
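
For readers who would rather not add a dependency, the idea described in the question (evaluate a few elements ahead, then compute more once some are consumed) can be approximated with the standard library alone. The following is only a minimal sketch and not how iterata is implemented: ChunkedParallel, parMap, parFlatMap and the chunkSize parameter are hypothetical names chosen for illustration. It pulls a fixed-size chunk from the source iterator, maps that chunk in parallel with Futures on the global execution context, and only pulls the next chunk when the consumer asks for more.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

// Hypothetical helper object, not part of the Scala standard library or iterata.
object ChunkedParallel {

  // Maps `f` over `source` in parallel, one chunk at a time.
  // The source is consumed lazily (about one chunk ahead of the consumer)
  // and results keep the original order.
  def parMap[A, B](source: Iterator[A], chunkSize: Int)(f: A => B): Iterator[B] =
    source.grouped(chunkSize).flatMap { chunk =>
      // Start one Future per element of the current chunk.
      val futures = chunk.map(a => Future(f(a)))
      // Block the consuming thread until the whole chunk has been computed.
      Await.result(Future.sequence(futures), Duration.Inf)
    }

  // Same idea for flatMap: compute each element's results in parallel,
  // then flatten them lazily in order.
  def parFlatMap[A, B](source: Iterator[A], chunkSize: Int)(f: A => Seq[B]): Iterator[B] =
    parMap(source, chunkSize)(f).flatMap(bs => bs)
}

// Usage: works on an infinite iterator because only the needed chunks are evaluated.
val firstThree = ChunkedParallel.parMap(Iterator.from(1), 4)(_ * 2).take(3).toList
// firstThree == List(2, 4, 6)
```

The trade-off in this sketch is that each chunk is only as fast as its slowest element, and the consumer blocks while a chunk is computed. A larger chunkSize gives more parallelism per step but evaluates more elements ahead of what has been consumed.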
