Is it possible to speed up a quicksort with par in Haskell?


Problem Description


I have got this seemingly trivial parallel quicksort implementation, the code is as follows:

import System.Random
import Control.Parallel
import Data.List

quicksort :: Ord a => [a] -> [a]
quicksort xs = pQuicksort 16 xs -- 16 is the number of sparks used to sort

-- pQuicksort, parallelQuicksort  
-- As long as n > 0 evaluates the lower and upper part of the list in parallel,
-- when we have recursed deep enough, n==0, this turns into a serial quicksort.
pQuicksort :: Ord a => Int -> [a] -> [a]
pQuicksort _ [] = []
pQuicksort 0 (x:xs) =
  let (lower, upper) = partition (< x) xs
  in pQuicksort 0 lower ++ [x] ++ pQuicksort 0 upper
pQuicksort n (x:xs) =
  let (lower, upper) = partition (< x) xs
      l = pQuicksort (n `div` 2) lower
      u = [x] ++ pQuicksort (n `div` 2) upper
  in (par u l) ++ u

main :: IO ()
main = do
  gen <- getStdGen
  let randints = (take 5000000) $ randoms gen :: [Int]
  putStrLn . show . sum $ (quicksort randints)

I compile with

ghc --make -threaded -O2 quicksort.hs

and run with

./quicksort +RTS -N16 -RTS

No matter what I do I can not get this to run faster than a simple sequential implementation running on one cpu.

  1. Is it possible to explain why this runs so much slower on several CPUs than on one?
  2. Is it possible to make this scale, at least sublinearly, with the number of CPUs by doing some trick?

EDIT: @tempestadept hinted that quicksort itself is the problem. To check this I implemented a simple merge sort in the same spirit as the example above. It has the same behaviour: it performs more slowly the more capabilities you add.

import System.Random
import Control.Parallel

splitList :: [a] -> ([a], [a])
splitList = helper True [] []
  where helper _ left right [] = (left, right)
        helper True  left right (x:xs) = helper False (x:left) right xs
        helper False left right (x:xs) = helper True  left (x:right) xs

merge :: (Ord a) => [a] -> [a] -> [a]
merge xs [] = xs
merge [] ys = ys
merge (x:xs) (y:ys) = case x<y of
  True  -> x : merge xs (y:ys)
  False -> y : merge (x:xs) ys

mergeSort :: (Ord a) => [a] -> [a]
mergeSort xs = pMergeSort 16 xs -- we use 16 sparks

-- pMergeSort, parallel merge sort. Takes an extra argument
-- telling how many sparks to create. In our simple test it is
-- set to 16
pMergeSort :: (Ord a) => Int -> [a] -> [a]
pMergeSort _ [] = []
pMergeSort _ [a] = [a]
pMergeSort 0 xs =
  let (left, right) = splitList xs
  in  merge (pMergeSort 0 left) (pMergeSort 0 right)
pMergeSort n xs =
  let (left, right) = splitList xs
      l = pMergeSort (n `div` 2) left
      r = pMergeSort (n `div` 2) right
  in  (r `par` l) `pseq` (merge l r)

ris :: Int -> IO [Int]
ris n = do
  gen <- getStdGen
  return . (take n) $ randoms gen

main = do
  r <- ris 100000
  putStrLn . show . sum $ mergeSort r

Solution

I'm not sure how well it can work for the idiomatic quicksort, but it can work (to a somewhat weak extent) for the true imperative quicksort as shown by Roman in Sparking Imperatives.
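
To make the "true imperative quicksort" idea concrete, here is a minimal sketch of my own (it is not the code from Sparking Imperatives): an in-place Lomuto partition over a mutable unboxed vector, with the two halves of each slice sorted in parallel. For simplicity it uses explicit threads via concurrently_ from the async package rather than sparks, and the helper names (partitionIO, pQuicksortIO), the cut-off value, and the choice of the vector and async packages are all illustrative assumptions.

{-# LANGUAGE BangPatterns #-}
import Control.Concurrent.Async (concurrently_)
import System.Random (getStdGen, randoms)
import qualified Data.Vector.Unboxed as V
import qualified Data.Vector.Unboxed.Mutable as M

-- In-place Lomuto partition around the first element of the slice;
-- returns the index where the pivot ends up.
partitionIO :: M.IOVector Int -> IO Int
partitionIO v = do
  pivot <- M.read v 0
  let n = M.length v
      go !store !i
        | i >= n    = do M.swap v 0 (store - 1)
                         return (store - 1)
        | otherwise = do
            x <- M.read v i
            if x < pivot
              then do M.swap v store i
                      go (store + 1) (i + 1)
              else go store (i + 1)
  go 1 1

-- Sort a mutable slice in place; slices longer than the cut-off have their
-- two halves sorted in parallel (here with explicit threads, not sparks).
pQuicksortIO :: Int -> M.IOVector Int -> IO ()
pQuicksortIO cutoff v
  | M.length v <= 1 = return ()
  | otherwise = do
      p <- partitionIO v
      let lower = M.slice 0 p v
          upper = M.slice (p + 1) (M.length v - p - 1) v
      if M.length v < cutoff
        then pQuicksortIO cutoff lower >> pQuicksortIO cutoff upper
        else concurrently_ (pQuicksortIO cutoff lower)
                           (pQuicksortIO cutoff upper)

main :: IO ()
main = do
  gen <- getStdGen
  let xs = take 1000000 (randoms gen) :: [Int]
  v <- V.thaw (V.fromList xs)     -- copy the list into a mutable vector
  pQuicksortIO 10000 v
  sorted <- V.freeze v
  print (V.sum sorted)

Compiled with ghc -O2 -threaded and run with +RTS -N, the parallel work here is mutation of a shared buffer rather than the construction of new lists, which is the distinction the answer is drawing.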

He never did get good speedup, though. This really needs a real work-stealing deque that doesn't overflow like the spark queue to optimize properly.
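
For comparison, the same list-based algorithm can be written against the monad-par library, whose Par monad is scheduled on work-stealing deques rather than the RTS spark pool. The sketch below is my own illustration, not code from the answer; the depth threshold and function names are assumptions.

import Control.Monad.Par (runPar, spawn, get)
import Data.List (partition)

-- List-based quicksort in the Par monad: spawn hands the lower half to the
-- work-stealing scheduler while this worker sorts the upper half.
parQuicksort :: Int -> [Int] -> [Int]
parQuicksort depth xs = runPar (go depth xs)
  where
    go _ []     = return []
    go 0 ys     = return (seqSort ys)   -- below the depth limit: stay sequential
    go d (y:ys) = do
      let (lower, upper) = partition (< y) ys
      lv <- spawn (go (d - 1) lower)    -- a stealable work item, fully evaluated
      u  <- go (d - 1) upper
      l  <- get lv
      return (l ++ [y] ++ u)

    seqSort []     = []
    seqSort (y:ys) =
      let (lower, upper) = partition (< y) ys
      in  seqSort lower ++ [y] ++ seqSort upper

Here spawn creates a work item that an idle worker thread can steal and that is evaluated to normal form before it is read back with get, so the sorting work itself (not just a thunk) is what gets shared between cores.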
