GHC优化：Collatz猜想 [英] GHC Optimization: Collatz conjecture

查看：80 发布时间：2018/6/4 15:20:34 haskell optimization

本文介绍了GHC优化：Collatz猜想的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我编写了欧拉项目挑战赛14 的代码，在 Haskell 和 C ++ （ideone链接）。他们都记得他们之前在数组中做过的任何计算。

分别使用 ghc -O2 和 g ++ -O3 ，C ++的运行速度比Haskell版本快10-15倍。

虽然我明白Haskell版本可能运行速度较慢，而Haskell是一种更好的语言，如果知道我可以对Haskell版本进行一些代码更改以使其运行速度更快（理想情况下是C ++版本的2或3倍），那将是很好的做法？

Haskell代码在这里：

  import Data.Array 
 import Data.Word 
 import Data.List 
 
 collatz_array = 
 let 
 upperbound = 1000000 
a = array（1，upperbound）[（i :: Word64 ，fi :: Int）| i<  -  [1..upperbound]] 
 f i = i`seq` 
 let 
 check_f i = i`seq` if i <= upperbound then a！ （if == 1）then 0 else（check_f（（if（even i）then i else 3 * i + 1）`div` 2））+ 1 $ b if $ if $ 
 
 $ b main = 
 putStrLn $ show $ 
 foldl1'（\（x1，x2）（y1，y2） - > if（x2> = y2）then（x1，x2）else（y1，y2））$！ （assocs collatz_array）

编辑：

我现在也做了一个使用unboxed可变数组的版本。它仍然比C ++版本慢5倍，但是是一个显着的改进。代码是在ideone 这里。

I' d想知道对可变数组版本的改进，使其更接近C ++版本。
解决方案
数组）代码：

使用折叠来查找最大链长，因为数组必须转换为关联列表，这需要时间和C ++版本不需要的分配。

使用甚至和 div 用于测试resp除以2.这些很慢。 g ++将操作优化为更快的位操作（至少在更快的平台上），但GHC不会进行这些低级优化（暂时），所以暂时必须手工完成。
您可以使用 readArray 和 writeArray 。在C ++代码中没有完成的额外边界检查也需要时间，一旦其他问题得到处理，这相当于运行时间的很大一部分（我的框中约为25％），因为已经完成了在算法中有很多读写操作。
在执行过程中加入了这个元素，我得到了 import Data.Array.ST import Data.Array.Base import Control.Monad.ST import Data.Bits collatz_array :: ST s（STUArray s Int Int） collatz_array = do let upper = 10000000 arr < - newArray（0，upper）0 unsafeWrite arr 2 1 让我检查我 |上部<我=返回arr |我和&。 1 == 0 = do l< - unsafeRead arr（i`shiftR` 1） unsafeWrite arr i（l + 1） check（i + 1） |否则= do let j =（3 * i + 1）`shiftR` 1 find k l |上部< k = find（next k）$！ l + 1 | k< i = do m< - unsafeRead arr k return（m + 1） |否则=做 m< - unsafeRead arr k 如果m == 0 然后做 n< - find（next k）1 unsafeWrite arr kn return（n + l） else return（m + l）其中 next h | h。& ;. 1 == 0 = h`shiftR` 1 |否则=（3 * h + 1）`shiftR` 1 l < - 找到j 1 unsafeWrite arr il check（i + 1） check 3 collatz_max :: ST s（Int，Int） collatz_max = do car< - collatz_array （_，upper）< - getBounds car let find wmi |上部<我= return（w，m） |否则= do l < - unsafeRead car i if m< l 然后找到il（i + 1） else找到wm（i + 1） find 1 0 2 main :: IO（） main = print（runST collatz_max）以及时间（均为1000万）： $ time ./cccoll 8400511 429 real 0m0.210s user 0m0 .200s sys 0m0.009s $ time ./stcoll （8400511,429） real 0m0.341s user 0m0.307s sys 0m0.033s 看起来不错。

 重要说明：该代码仅适用于64位GHC（因此，特别是在Windows上，您需要ghc-7.6.1或更高版本，先前的GHC为32即使在64位Windows上也是如此），因为中间链元素超过了32位范围。在32位系统上，必须使用 Integer 或64位整数类型（ Int64 或<$因为原始的64位操作（算术和移位）被实现为对32位GHC中的C函数的外部调用，所以以严格的性能成本执行后续链（c $ c> Word64 ）  fast 外部调用，但仍然比直接机器操作慢得多）。  
I've written code for the Project Euler's Challenge 14, in both Haskell and C++ (ideone links). They both remember any calculations they have previously done in an array. 

Using ghc -O2 and g++ -O3 respectively, the C++ runs 10-15 times faster than the Haskell version.

Whilst I understand the Haskell version may run slower, and that Haskell is a nicer language to write in, it would be nice to know some code changes I can make to the Haskell version to make it run faster (ideally within a factor of 2 or 3 of the C++ version)?



Haskell code is here:
import Data.Array
import Data.Word
import Data.List

collatz_array = 
  let
    upperbound = 1000000
    a = array (1, upperbound) [(i :: Word64, f i :: Int) | i <- [1..upperbound]]
    f i = i `seq`
      let
        check_f i = i `seq` if i <= upperbound then a ! i else f i
      in
        if (i == 1) then 0 else (check_f ((if (even i) then i else 3 * i + 1) `div` 2)) + 1
  in a

main = 
  putStrLn $ show $ 
   foldl1' (\(x1,x2) (y1,y2) -> if (x2 >= y2) then (x1, x2) else (y1, y2)) $! (assocs collatz_array)




Edit:

I've now also done a version using unboxed mutable arrays. It is still 5 times slower than the C++ version, but a significant improvement. The code is on ideone here.

I'd like to know improvements to the mutable array version which bring it closer to the C++ version.
 解决方案 
Some problems with your (mutable array) code:


You use a fold to find the maximal chain length, for that the array has to be converted to an association list, that takes time and allocation the C++ version doesn't need.
You use even and div for testing resp dividing by 2. These are slow. g++ optimises both operations to the faster bit operations (on platforms where that is supposedly faster, at least), but GHC doesn't do these low-level optimisations (yet), so for the time being, they have to be done by hand.
You use readArray and writeArray. The extra bounds-checking that isn't done in the C++ code also takes time, once the other problems are dealt with, that amounts to a significant portion of the running time (ca. 25% on my box), since there are done a lot of reads and writes in the algorithm.


Incorporating that into the implementation, I get
import Data.Array.ST
import Data.Array.Base
import Control.Monad.ST
import Data.Bits

collatz_array :: ST s (STUArray s Int Int)
collatz_array = do
    let upper = 10000000
    arr <- newArray (0,upper) 0
    unsafeWrite arr 2 1
    let check i
            | upper < i = return arr
            | i .&. 1 == 0 = do
                l <- unsafeRead arr (i `shiftR` 1)
                unsafeWrite arr i (l+1)
                check (i+1)
            | otherwise = do
                let j = (3*i+1) `shiftR` 1
                    find k l
                        | upper < k = find (next k) $! l+1
                        | k < i     = do
                            m <- unsafeRead arr k
                            return (m+l)
                        | otherwise = do
                            m <- unsafeRead arr k
                            if m == 0
                              then do
                                  n <- find (next k) 1
                                  unsafeWrite arr k n
                                  return (n+l)
                              else return (m+l)
                          where
                            next h
                                | h .&. 1 == 0 = h `shiftR` 1
                                | otherwise = (3*h+1) `shiftR` 1
                l <- find j 1
                unsafeWrite arr i l
                check (i+1)
    check 3

collatz_max :: ST s (Int,Int)
collatz_max = do
    car <- collatz_array
    (_,upper) <- getBounds car
    let find w m i
            | upper < i = return (w,m)
            | otherwise = do
                l <- unsafeRead car i
                if m < l
                  then find i l (i+1)
                  else find w m (i+1)
    find 1 0 2

main :: IO ()
main = print (runST collatz_max)
And the timings (both for 10 million):
$ time ./cccoll
8400511 429

real    0m0.210s
user    0m0.200s
sys     0m0.009s
$ time ./stcoll
(8400511,429)

real    0m0.341s
user    0m0.307s
sys     0m0.033s
which doesn't look too bad.

Important note: That code only works on 64-bit GHC (so, in particular, on Windows, you need ghc-7.6.1 or later, previous GHCs were 32-bit even on 64-bit Windows) since intermediate chain elements exceed 32-bit range. On 32-bit systems, one would have to use Integer or a 64-bit integer type (Int64 or Word64) for following the chains, at a drastic performance cost, since the primitive 64-bit operations (arithmetic and shifts) are implemented as foreign calls to C functions in 32-bit GHCs (fast foreign calls, but still much slower than direct machine ops). 

                        这篇关于GHC优化：Collatz猜想的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！


                    
                        查看全文

GHC优化：Collatz猜想 [英] GHC Optimization: Collatz conjecture

问题描述

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

GHC优化：Collat​​z猜想 [英] GHC Optimization: Collatz conjecture

问题描述

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

GHC优化：Collatz猜想 [英] GHC Optimization: Collatz conjecture

登录关闭