使用java.util.Arrays和scala.concurrent.ops.par进行Scala并行排序 [英] Scala Parallel Sort Using java.util.Arrays and scala.concurrent.ops.par

查看:94
本文介绍了使用java.util.Arrays和scala.concurrent.ops.par进行Scala并行排序的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我认为我的某些代码实现错误.我不知道为什么我的排序(使用arrays.sort)在并行"版本中比在非并行版本中花费更长的时间(显然是将两个数组放在一起,但是我不认为这会增加还有更多时间).如果有人可以指出我犯的任何错误或与非并行版本相比改进并行版本的任何技巧,我将不胜感激.我是否可以使阵列更有效地合并,甚至可以并行执行?如果是这样,实施的最佳实践是什么.任何帮助将不胜感激.

I think I have implemented some of my code wrong. I cannot figure out why my sort (using arrays.sort) is taking longer in the "parallel" version than in the non-parallel version (it's obviously in putting the two arrays back together, but I didn't think it would add that much more time on). If someone could point out any mistakes that I am making or any tips to improve the parallel version over the non-parallel version I would appreciate it. Am I able to do the array merge more efficiently, or maybe even in parallel? If so, what is the best practice for implementation. Any help would be greatly appreciated.

import java.util.Arrays
import scala.concurrent._
import scala.collection._

trait Sorts {
  def doSort(a: Array[Double]): Array[Double]
}

object Simple extends Sorts {
  def doSort(a: Array[Double]) = {
    Arrays.sort(a)
    a
  }
}

object Parallel extends Sorts {
  def doSort(a: Array[Double]) = {
    val newArray = new Array[Double](a.length)
    val aLength = (a.length)
    val aSplit = ((a.length / 2).floor).toInt
    ops.par(Arrays.sort(a, 0, aSplit), Arrays.sort(a, (aSplit + 1), aLength))
    def merge(w: Int, x: Int, y: Int) {
      var i = w
      var j = x
      var k = y
      while (i <= aSplit && j <= aLength) {
        if (a(i) <= a(j)) {
          newArray(k) = a(i)
          i = i + 1
        } else {
          newArray(k) = a(j)
          j = j + 1
        }
        k = k + 1
      }
      if (i < aSplit) {
        for (i <- i until aSplit) {
          newArray(k) = a(i)
          k = k + 1
        }
      } else {
        for (j <- j until aLength) {
          newArray(k) = a(j)
          k = k + 1
        }
      }
    }
    merge(0, (aSplit + 1), 0)
    newArray
  }
}

object Main {
  def main(args: Array[String]): Unit = {
    val simpleNumbers = Array.fill(10000)(math.random)
    println(simpleNumbers.toList + "\n")
    val simpleStart = System.nanoTime()
    Simple.doSort(simpleNumbers)
    val simpleEnd = System.nanoTime()
    println(simpleNumbers.toList + "\n")
    val simpleDifference = ((simpleEnd - simpleStart) / 1e9).toDouble

    val parallelNumbers = Array.fill(10000)(math.random)
    println(parallelNumbers.toList + "\n")
    val parallelStart = System.nanoTime()
    Parallel.doSort(parallelNumbers)
    val parellelEnd = System.nanoTime()
    println(parallelNumbers.toList + "\n")
    val parallelDifference = ((parellelEnd - parallelStart) / 1e9).toDouble

    println("\n Simple Time Taken: " + simpleDifference + "\n")
    println("\n Parallel Time Taken: " + parallelDifference + "\n")

  }
}

在Intel Core i7上的输出:

Output on an Intel Core i7:

Simple Time Taken: 0.01314
Parallel Time Taken: 0.05882

推荐答案

我认为您在这里遇到了许多不同的事情.首先,在我的系统上,ops.par(Arrays.sort(...))本身比所有Simple.doSort()花费的时间更长.因此,必须有一些开销(线程创建?)主导着较小阵列的性能提升.尝试使用100,000或100万个元素.其次,Arrays.sort是就地排序,因此不必为结果创建新的10k元素数组而产生成本.

I think you've got a couple of different things going on here. First, on my system the ops.par(Arrays.sort(...)) line by itself takes longer than all of Simple.doSort(). So there must be some overhead (thread creation?) that dominates the performance gain for a smallish array. Try it for 100,000 or a million elements. Second, Arrays.sort is an in-place sort, so it doesn't have to incur the cost of creating a new 10k element array for the results.

为避免创建第二个数组,您可以先进行分区,然后将两个半部分并行排序,如

To avoid creating the second array, you can do the partition first and then sort the two halves in parallel, as recommended here

def doSort(a: Array[Double]) = {
  val pivot = a(a.length-1)
  var i = 0
  var j = a.length-2
  def swap(i: Int, j: Int) {
    val temp = a(i)
    a(i) = a(j)
    a(j) = temp
  }
  while(i < j-1) {
   if(a(i) <= pivot) {
    i+=1
   }
   else {
    swap(i,j)
    j-=1
   }
  }
  swap(j-1, a.length-1)
  ops.par(Arrays.sort(a,0,a.length/2), Arrays.sort(a,a.length/2+1,a.length))
  a
}

将阵列大小增加到100k之后,我确实看到并行版本在Intel E5300 CPU上的运行速度快两倍.

After upping the array size to 100k, I do see the parallel version performing around twice as fast on an Intel E5300 CPU.

这篇关于使用java.util.Arrays和scala.concurrent.ops.par进行Scala并行排序的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆