在Haskell中快速排序 [英] Fast sorting in Haskell

查看:174
本文介绍了在Haskell中快速排序的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

阅读Stack Overflow问题 使用向量来改善Haskell的性能 在Haskell中放置 quicksort ,我为自己设定了两个目标:
$


  • 执行中位数为3的相同算法以避免在预先排序的向量上表现不佳;

  • 制作一个平行版本。
  • 下面是结果(为简单起见,

     将合格的Data.Vector.Unboxed.Mutable导入为MV 
    将合格的Data.Vector.Generic.Mutable导入为GM

    类型Vector = MV.IOVector Int
    类型Sort = Vector - > IO()

    medianofthreepartition :: Vector - > Int - > IO Int
    medianofthreepartition uv li = do
    p1< - MV.unsafeRead uv li
    p2< - MV.unsafeRead uv $ li`div` 2
    p3< - MV.unsafeRead uv 0
    let p = median p1 p2 p3
    GM.unstablePartition(< p)uv

    vquicksort :: Sort
    vquicksort uv = do $ (j> 1)(vquicksort(MV.unsafeSlice 0 j uv))
    when(j> 1)时,b $ b让li = MV.length uv -1 b $ bj< - medianofthreepartition uv li
    j + 1
    vparquicksort :: Sort
    vparquicksort uv = do
    让li = MV.length uv-1
    j< - 三分配的中值uv li
    t1 - tryfork(j> 1)(vparquicksort(MV.unsafeSlice 0 j uv))
    t2 < - tryfork(j + 1 等待t1
    等待t2

    tryfork :: Bool - > IO() - > IO(Maybe(MVar()))
    tryfork False _ = return Nothing
    tryfork True action = do
    done< - newEmptyMVar :: IO(MVar())
    _ < - forkFinally action(\_ - > putMVar done())
    return $刚刚完成

    wait :: Maybe(MVar()) - > IO()
    等待Nothing =返回()
    等待(完成)= swapMVar完成()

    中值:: Int - > Int - > Int - >诠释
    中位数a b c
    | a> b =
    如果b> c然后b
    否则如果a> c然后c
    else a
    |否则=
    如果a> c然后a
    否则如果b> c然后c
    else b

    对于具有1,000,000个元素的矢量,我得到以下结果:

     线程数量:4

    ****并行****
    测试按长度排序:1000000
    创建矢量
    打印矢量
    对随机矢量进行排序
    CPU时间:12.30 s
    对有序向量进行排序
    CPU时间:9.44 s

    ****单线程****
    按长度排序的测试:1000000
    创建矢量
    打印矢量
    对随机矢量进行排序
    CPU时间:0.27 s
    排序矢量
    CPU时间:0.39 s

    我的问题是:



    • 为什么使用预先排序好的矢量表现仍然在下降?

    • 为什么使用forkIO和四个线程无法提高性能?

      解决方案

      更好的办法是使用 Control.Parallel.S trategies 来并行化quicksort。采用这种方法,您不会为每个可以并行执行的代码创建昂贵的线程。你也可以创建一个纯粹的计算,而不是IO。

      然后你必须根据你拥有的核心数来编译:
      http://www.haskell.org/ghc/docs/latest/html/users_guide /using-concurrent.html



      举个例子,看看Jim Apple编写的这个简单的快速列表:

       导入Data.HashTable为H 
      导入Data.Array.IO
      导入Control.Parallel.Strategies
      导入Control.Monad
      import System

      exch air =
      do tmpi< - readArray ai
      tmpr< - readArray ar
      writeArray ai tmpr
      writeArray ai tmpi

      bool abc = if c then a else b

      quicksort arr lr =
      if r <= l then return()else do
      i-loop(1-1)r =<< readArray arr r
      exch arr ir
      withStrategy rpar $ quicksort arr l(i-1)
      quicksort arr(i + 1)r
      其中
      loop ijv = do $()()()()()()()()() -1))
      if(i'< j')then exch arr i'j'>>循环我'j'v
      else返回我'
      找到p f i =如果我== l然后返回我
      else bool(返回i)(找到p f(f i))。 p =<< readArray arr i
      $ b main =
      do [testSize]< - fmap(fmap read)getArgs
      arr< - testPar testSize
      ans< - readArray arr (testSize`div` 2)
      print ans

      testPar testSize =
      do x < - testArray testSize
      quicksort x 0(testSize - 1)
      返回x

      testArray :: Int - > IO(IOArray Int Double)
      testArray testSize =
      do ans< - newListArray(0,testSize-1)[fromIntegral $ H.hashString $ show i | i< - [1..testSize]]
      return ans


      After reading Stack Overflow question Using vectors for performance improvement in Haskell describing a fast in-place quicksort in Haskell, I set myself two goals:

      • Implementing the same algorithm with a median of three to avoid bad performances on pre-sorted vectors;

      • Making a parallel version.

      Here is the result (some minor pieces have been left for simplicity):

      import qualified Data.Vector.Unboxed.Mutable as MV
      import qualified Data.Vector.Generic.Mutable as GM
      
      type Vector = MV.IOVector Int
      type Sort = Vector -> IO ()
      
      medianofthreepartition :: Vector -> Int -> IO Int
      medianofthreepartition uv li = do
          p1 <- MV.unsafeRead uv li
          p2 <- MV.unsafeRead uv $ li `div` 2
          p3 <- MV.unsafeRead uv 0
          let p = median p1 p2 p3
          GM.unstablePartition (< p) uv
      
      vquicksort :: Sort
      vquicksort uv = do
          let li = MV.length uv - 1
          j <- medianofthreepartition uv li
          when (j > 1) (vquicksort (MV.unsafeSlice 0 j uv))
          when (j + 1 < li) (vquicksort (MV.unsafeSlice (j+1) (li-j) uv))
      
      vparquicksort :: Sort
      vparquicksort uv = do
          let li = MV.length uv - 1
          j <- medianofthreepartition uv li
          t1 <- tryfork (j > 1) (vparquicksort (MV.unsafeSlice 0 j uv))
          t2 <- tryfork (j + 1 < li) (vparquicksort (MV.unsafeSlice (j+1) (li-j) uv))
          wait t1
          wait t2
      
      tryfork :: Bool -> IO () -> IO (Maybe (MVar ()))
      tryfork False _ = return Nothing
      tryfork True action = do
        done <- newEmptyMVar :: IO (MVar ())
        _ <- forkFinally action (\_ -> putMVar done ())
        return $ Just done
      
      wait :: Maybe (MVar ()) -> IO ()
      wait Nothing = return ()
      wait (Just done) = swapMVar done ()
      
      median :: Int -> Int -> Int -> Int
      median a b c
              | a > b =
                      if b > c then b
                              else if a > c then c
                                      else a
              | otherwise =
                      if a > c then a
                              else if b > c then c
                                      else b
      

      For vectors with 1,000,000 elements, I get the following results:

      "Number of threads: 4"
      
      "**** Parallel ****"
      "Testing sort with length: 1000000"
      "Creating vector"
      "Printing vector"
      "Sorting random vector"
      CPU time:  12.30 s
      "Sorting ordered vector"
      CPU time:   9.44 s
      
      "**** Single thread ****"
      "Testing sort with length: 1000000"
      "Creating vector"
      "Printing vector"
      "Sorting random vector"
      CPU time:   0.27 s
      "Sorting ordered vector"
      CPU time:   0.39 s
      

      My questions are:

      • Why are performances still decreasing with a pre-sorted vector?
      • Why does using forkIO and four thread fails to improve performances?

      解决方案

      A better idea is to use Control.Parallel.Strategies to parallelize quicksort. With this approach you will not create expensive threads for every code that can be executed in parallel. You can also create a pure computation instead an IO.

      Then you have to compile according to the number of cores you have: http://www.haskell.org/ghc/docs/latest/html/users_guide/using-concurrent.html

      For an example, look at this simple quicksort on lists, written by Jim Apple:

      import Data.HashTable as H
      import Data.Array.IO
      import Control.Parallel.Strategies
      import Control.Monad
      import System
      
      exch a i r =
          do tmpi <- readArray a i
             tmpr <- readArray a r
             writeArray a i tmpr
             writeArray a i tmpi
      
      bool a b c = if c then a else b
      
      quicksort arr l r =
        if r <= l then return () else do
          i <- loop (l-1) r =<< readArray arr r
          exch arr i r
          withStrategy rpar $ quicksort arr l (i-1)
          quicksort arr (i+1) r
        where
          loop i j v = do
            (i', j') <- liftM2 (,) (find (>=v) (+1) (i+1)) (find (<=v) (subtract 1) (j-1))
            if (i' < j') then exch arr i' j' >> loop i' j' v
                         else return i'
          find p f i = if i == l then return i
                       else bool (return i) (find p f (f i)) . p =<< readArray arr i
      
      main = 
          do [testSize] <- fmap (fmap read) getArgs
             arr <- testPar testSize
             ans <- readArray arr  (testSize `div` 2)
             print ans
      
      testPar testSize =
          do x <- testArray testSize
             quicksort x 0 (testSize - 1)
             return x
      
      testArray :: Int -> IO (IOArray Int Double)
      testArray testSize = 
          do ans <- newListArray (0,testSize-1) [fromIntegral $ H.hashString $ show i | i <- [1..testSize]]
             return ans
      

      这篇关于在Haskell中快速排序的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆