Scala: parallel collections not working?


Question

I'm trying to use parallel collections in a very basic way via .par. I expected the collection to be processed out of order, but that doesn't seem to be the case:

scala> (1 to 10) map println
1
2
3
4
5
6
7
8
9
10

scala> (1 to 10).par map println
1
2
3
4
5
6
7
8
9
10

It seems like the order shouldn't be sequential in the latter case. This is with Scala 2.9, and my machine has 2 cores. Is this perhaps a misconfiguration somewhere? Thanks!

Edit: I did try running with a large collection (100k elements), and the output was still sequential.

Recommended answer

YMMV:

scala> (1 to 10).par map println
1
6
2
3
4
7
5
8
9

This is also on a dual-core machine...

I think if you try enough runs you may see different results. Here is a piece of code that shows some of what happens:

import collection.parallel._
import collection.parallel.immutable._

class ParRangeEx(range: Range) extends ParRange(range) {
  // Some minimal number of elements after which this collection 
  // should be handled sequentially by different processors.
  override def threshold(sz: Int, p:Int) = {
    val res = super.threshold(sz, p)
    printf("threshold(%d, %d) returned %d\n", sz, p, res)
    res
  }
  override def splitter = {
    new ParRangeIterator(range) 
        with SignalContextPassingIterator[ParRangeIterator] {
      override def split: Seq[ParRangeIterator] = {
        val res = super.split
        println("split " + res) // probably doesn't show further splits
        res
      }
    }
  }
}

new ParRangeEx((1 to 10)).par map println

On some runs I get interleaved processing, on others I get sequential processing. It seems to split the load in two. If you change the value returned by threshold to 11, you'll see that the workload is never split.
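One way to check whether a given run was really handled by more than one thread is to record the thread name alongside each element. The following is only a sketch: it assumes Scala 2.9 to 2.12, where .par is available without extra imports (newer versions need the separate parallel collections module), and the names ParDemo and tagWithThread are made up for illustration.

```scala
// Sketch only: assumes Scala 2.9 to 2.12, where .par needs no extra import.
// ParDemo and tagWithThread are hypothetical names used for illustration.
object ParDemo {
  // Pair each element with the name of the thread that processed it.
  def tagWithThread(n: Int): List[(Int, String)] =
    (1 to n).par.map(i => (i, Thread.currentThread.getName)).toList

  def main(args: Array[String]): Unit = {
    val tagged = tagWithThread(10)
    // The mapped result keeps the original element order, even when the
    // function itself ran out of order on several threads.
    tagged.foreach { case (i, name) => println(i + " -> " + name) }
    println("distinct threads: " + tagged.map(_._2).distinct.size)
  }
}
```

If the distinct thread count is 1, a single worker picked up every chunk, which would look exactly like sequential processing in the println output. The count can differ from run to run.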

The underlying scheduling mechanism is based on fork-join and work stealing. See the JSR166 source code for some insights. This is probably what determines whether the same thread picks up both tasks (so the output looks sequential) or each task is handled by a different thread.

Here is an example output on my computer:

threshold(10, 2) returned 1
split List(ParRangeIterator(over: Range(1, 2, 3, 4, 5)), 
  ParRangeIterator(over: Range(6, 7, 8, 9, 10)))
threshold(10, 2) returned 1
threshold(10, 2) returned 1
threshold(10, 2) returned 1
threshold(10, 2) returned 1
threshold(10, 2) returned 1
6
7
threshold(10, 2) returned 1
8
1
9
2
10
3
4
5
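To make the fork-join and threshold idea mentioned above more concrete, here is a simplified sketch written directly against java.util.concurrent (Java 7 or later). It is not how the parallel collections are implemented: the names ForkJoinSketch, SumTask and run are hypothetical, and the splitting rule (halve the range while it is larger than the threshold) is an assumption chosen for illustration.

```scala
import java.util.concurrent.{ForkJoinPool, RecursiveTask}
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical illustration, not the parallel collections implementation.
object ForkJoinSketch {
  // Counts how many chunks were processed without further splitting.
  val leaves = new AtomicInteger(0)

  // Sums the integers in [lo, hi), halving the range while it is
  // larger than `threshold`.
  class SumTask(lo: Int, hi: Int, threshold: Int) extends RecursiveTask[Long] {
    override def compute(): Long =
      if (hi - lo <= threshold) {
        leaves.incrementAndGet()
        (lo until hi).foldLeft(0L)(_ + _)
      } else {
        val mid = lo + (hi - lo) / 2
        val left = new SumTask(lo, mid, threshold)
        val right = new SumTask(mid, hi, threshold)
        left.fork()                   // queue the left half; an idle worker may steal it
        right.compute() + left.join() // run the right half on the current thread
      }
  }

  // Sums 1..10 on a two-worker pool; returns (sum, number of unsplit chunks).
  def run(threshold: Int): (Long, Int) = {
    leaves.set(0)
    val pool = new ForkJoinPool(2)
    try {
      val sum = pool.invoke(new SumTask(1, 11, threshold))
      (sum, leaves.get)
    } finally pool.shutdown()
  }
}
```

In this sketch, a threshold of 5 splits the ten elements into two chunks, while a threshold of 11 leaves them as a single chunk. Whether the forked half actually runs on a second thread depends on whether an idle worker steals it before the current thread gets to it.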
