欧拉项目#14 Haskell技巧? [英] Project Euler #14 Tips in Haskell?

查看:107
本文介绍了欧拉项目#14 Haskell技巧?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试欧洲挑战赛14 。我想知道是否可以在Haskell中快速计算任何提示。我试过这种天真的做法。


import Data.List
import Data.Function
collat​​z n |甚至n = n quot 2
|否则= 3 * n + 1

colSeq = takeWhile(/ = 1)。 (iterate collat​​z)

main = print $ maximumBy(比较 on (length。colSeq))[1..999999]




但这花了太长时间。

 λ< * Main System.Timeout>:timeout(10 ^ 6 * 60)main 
Nothing

我也尝试过使用reverse collat​​z关系,并在地图中保留长度以消除多余的计算,但这并不起作用无论是。不希望解决方案,但有人有一些数学文献,或编程技术,这将使这更快,或者我只是不得不离开它过夜吗?

解决方案

你的程序并不像你想象的那么慢...



首先,你的程序运行良好,如果使用 -O2 进行编译并增加堆栈大小(我使用 + RTS -K100m ),但系统可能会分钟):

$ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ \collat​​z.exe + RTS -K100m -s
在堆中分配的65,565,993,768字节
在GC
期间复制的16,662,910,752字节最大居民地址(1129个样本)
5,199,140字节最大地址
184 MB总内存使用量(由于分段而丢失0 MB)

总时间(已用)平均暂停最大值暂停
Gen 0 124724 colls,0 par 18.41s 18.19s 0.0001s 0.0032s
Gen 1 1129 colls,0 par 16.67s 16.34s 0.0145s 0.1158s

初始时间0.00s(已过去0.00s)
MUT时间39.98s(已经过了41.17s )
GC时间35.08s(已过34.52s)
出口时间0.00s(已过去0.00s)
总时间75.06s(已过去75.69s)

%GC时间46.7%(经过45.6%)

分配给每个MUT的1,639,790,387个字节第二个

生产力总用户的53.3%,已用完总数的52.8%



...但仍然很慢



〜50%的生产率意味着GC正在使用我们盯着屏幕的一半时间,等待我们的结果。在我们的例子中,我们通过迭代每个值的序列来创建很多垃圾。



改进



Collat​​z序列是一个递归序列。因此,我们应该将其定义为递归序列而不是迭代序列,并查看发生了什么。

  colSeq 1 = [1] 
colSeq n
|甚至n = n:colSeq(n`div` 2)
|否则= n:colSeq(3 * n + 1)

Haskell中的列表是一个基本类型,所以GHC应该有一些漂亮的优化( -O2 )。所以让我们试试这个:

结果



  $ .\ collat​​z_rec.exe + RTS -s 
在堆中分配的37,491,417,368字节
在GC
中复制的4,288,084字节最大居民身份(2个样本)
复制19,580字节最大值
使用的总内存1 MB(由于碎片而丢失0 MB)

总时间(已用)平均暂停最大暂停
Gen 0 72068 colls,0 par 0.22s 0.22s 0.0000 s 0.0001s
Gen 1 2 colls,0 par 0.00s 0.00s 0.0001s 0.0001s

初始时间0.00s(已过去0.00s)
MUT时间32.89s(33.12s经过)
GC时间0.22s(经过0.22s)
EXIT时间0.00s(经过0.00s)
总时间33.11s(经过33.33s)

% GC时间0.7%(已过去0.7%)

分配给每个MUT的1,139,881,573个字节第二个

生产力总用户的99.3%,已用完总数的98.7%

请注意,我们现在在〜80%MUT时间内生产率达到99%(与原始版本相比)。



等等,还有更多!



有一件事是很奇怪。为什么我们要计算 1024和512的长度?毕竟,后者不能创建更长的Collat​​z序列。

改进



然而,在这种情况下,我们必须将问题视为一项重大任务,而不是地图。我们需要跟踪我们已经计算出的值,并且我们要清除我们已经访问过的那些值。



我们使用 Data.Set 为此:

  problem_14 :: S.Set Integer  - > [(整数,整数)] 
problem_14 s
| S.null s = []
| (c,fromIntegral $ length csq):problem_14 rest
其中S.fromList csq

我们使用 problem_14 如下所示:

$ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ ]



结果



  $ .\collat​​z_set.exe + RTS -s 
在堆中分配的18,405,282,060字节
在GC
中复制的1,645,842,328字节最大居民地址(40个样本)27,446,972字节
373,056字节最大污水量
79 MB使用的总内存(由于分段造成的0 MB丢失)

总时间(已用)平均暂停最大值暂停
Gen 0 35193 0 par 2.17s 2.03s 0.0001s 0.0002s
Gen 1 40 colls,0 par 0.84s 0.77 s 0.0194s 0.0468s

初始时间0.00s(已过0.00s)
MUT时间14.91s(已过15.17s)
GC时间3.02s(已过时2.81s)
EXIT时间0.00s(已过去0.00s)
总时间17.92s(已过17.98s)

%GC时间16.8%(已过时15.6%)

分配给每MUT的1,234,735,903个字节第二个

生产力总用户的83.2%,占总花费的82.9%

我们放松了一些生产力,但这是合理的。毕竟,我们现在使用 Set 而不是列表,并使用79MB而不是1MB。然而,我们的程序现在运行时间是17s而不是34s,这只是原来时间的25%。



使用 ST



灵感(C ++)



  int main(){
std :: vector< bool> Q(百万,TRUE);

unsigned long long max_l = 0,max_c = 1;
$ b $ for(unsigned long i = 1; i< Q.size(); ++ i){
if(!Q [i])
continue;
unsigned long long c = i,l = 0; (c!= 1){
if(c c = c%2 == 0? c / 2:3 * c + 1;
l ++;
}
if(l> max_l){
max_l = 1;
max_c = i;
}
}
std :: cout<< max_c<<的std :: ENDL;
}

这个程序在130ms内运行。我们最好的版本需要100倍以上。
$ b

Haskell



  problem_14_vector_st :: Int  - > (Int,Int)
problem_14_vector_st limit =
runST $ do
q< - V.replicate(limit + 1)True
best < - newSTRef(1,1)
forM_ [1..limit] $ \i - >做
b< - V. read qi
当b $ do
让csq = colSeq $ fromIntegral i
let l = fromIntegral $ length csq
forM_(map fromIntegral csq)$ \j-> (l>& j> = 0)时的
$ V $写入qj假
m< -fmap snd $ readSTRef best
当(l> m)$ writeSTRef best(i,l)
readSTRef best



结果



  $ collat​​z_vector_st.exe + RTS -s 
堆中分配的2,762,282,216字节
GC
中复制的10,021,016字节最大值1,026,580字节(2个样本)
21,684字节最大污水量
2 MB使用的总内存量(0 MB由于分段丢失)

总时间(已用)平均暂停最大值暂停
Gen 0 5286 colls,0 par 0.02s 0.02s 0.0000s 0.0000s
Gen 1 2 colls,0 par 0.00s 0.00s 0.0001s 0.0001s

初始时间0.00秒(经过0.00秒)
MUT时间3.09秒(经过3.08秒)
GC时间0.02秒(经过0.02秒)
退出时间0.00s(经过0.00s)
总时间3.11s(已过3.11s)

%GC时间0.5%(已过0.7%)

分配率892,858,898字节每MUT第二个

生产力总用户的99.5%,已用完总数的99.6%



< 〜> 3秒。其他人可能会知道更多的技巧,但这是我可以从Haskell中榨取的最多。


I am trying euler challenge 14. I was wondering if I could have any tips for calculating it quickly in haskell. I tried this naive approach.

import Data.List import Data.Function collatz n | even n = n quot 2 | otherwise = 3*n+1 colSeq = takeWhile (/= 1) . (iterate collatz) main=print $ maximumBy (compare on (length . colSeq)) [1..999999]

But that took too long.

λ <*Main System.Timeout>: timeout (10^6*60) main
Nothing

I also tried using the reverse collatz relation, and keeping the lengths in a map to eliminate redundant calculations, but that didn't work either. And don't want the solution, but does anyone have some mathematical literature, or programming technique that will make this quicker, or do I just have to leave it over night?

解决方案

Your program is not as slow as you might think…

First of all, your program runs fine and finishes in under two minutes if you compile with -O2 and increase the stack size (I used +RTS -K100m, but your system might vary):

$ .\collatz.exe +RTS -K100m -s
  65,565,993,768 bytes allocated in the heap
  16,662,910,752 bytes copied during GC
      77,042,796 bytes maximum residency (1129 sample(s))
       5,199,140 bytes maximum slop
             184 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0     124724 colls,     0 par   18.41s   18.19s     0.0001s    0.0032s
  Gen  1      1129 colls,     0 par   16.67s   16.34s     0.0145s    0.1158s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time   39.98s  ( 41.17s elapsed)
  GC      time   35.08s  ( 34.52s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time   75.06s  ( 75.69s elapsed)

  %GC     time      46.7%  (45.6% elapsed)

  Alloc rate    1,639,790,387 bytes per MUT second

  Productivity  53.3% of total user, 52.8% of total elapsed

…but that's still slow

Productivity of ~50% percent means that the GC is using half the time we're staring at the screen, waiting for our result. In our case we create to much garbage by iterating the sequence for every value.

Improvements

The Collatz sequence is a recursive sequence. Therefore, we should define it as a recursive sequence instead of a iterative one and have a look at what happens.

colSeq 1 = [1]
colSeq n 
  | even n    = n : colSeq (n `div` 2) 
  | otherwise = n : colSeq (3 * n + 1)

The list in Haskell is a fundamental type, so GHC should have some nifty optimization (-O2). So lets try this one:

Result

$ .\collatz_rec.exe +RTS -s
  37,491,417,368 bytes allocated in the heap
       4,288,084 bytes copied during GC
          41,860 bytes maximum residency (2 sample(s))
          19,580 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0     72068 colls,     0 par    0.22s    0.22s     0.0000s    0.0001s
  Gen  1         2 colls,     0 par    0.00s    0.00s     0.0001s    0.0001s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time   32.89s  ( 33.12s elapsed)
  GC      time    0.22s  (  0.22s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time   33.11s  ( 33.33s elapsed)

  %GC     time       0.7%  (0.7% elapsed)

  Alloc rate    1,139,881,573 bytes per MUT second

  Productivity  99.3% of total user, 98.7% of total elapsed

Note that we're now up to 99% productivity in ~80% MUT time (compared to the original version). Just by this small change we decreased the runtime tremendously.

Wait, there's more!

There's a thing that's rather strange. Why are we calculating the length of both 1024 and 512? After all, the later cannot create a longer Collatz sequence.

Improvements

However, in this case we must see the problem as one big task, and not a map. We need to keep track of the values we already calculated, and we want to clear those values we already visited.

We use Data.Set for this:

problem_14 :: S.Set Integer -> [(Integer, Integer)]
problem_14 s
  | S.null s  = []
  | otherwise = (c, fromIntegral $ length csq) : problem_14 rest
  where (c, rest') = S.deleteFindMin s
        csq        = colSeq c
        rest       = rest' `S.difference` S.fromList csq

And we use problem_14 like that:

main = print $ maximumBy (compare `on` snd) $ problem_14 $ S.fromList [1..999999]

Result

$ .\collatz_set.exe +RTS -s
  18,405,282,060 bytes allocated in the heap
   1,645,842,328 bytes copied during GC
      27,446,972 bytes maximum residency (40 sample(s))
         373,056 bytes maximum slop
              79 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0     35193 colls,     0 par    2.17s    2.03s     0.0001s    0.0002s
  Gen  1        40 colls,     0 par    0.84s    0.77s     0.0194s    0.0468s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time   14.91s  ( 15.17s elapsed)
  GC      time    3.02s  (  2.81s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time   17.92s  ( 17.98s elapsed)

  %GC     time      16.8%  (15.6% elapsed)

  Alloc rate    1,234,735,903 bytes per MUT second

  Productivity  83.2% of total user, 82.9% of total elapsed

We loose some productivity, but that's reasonable. After all, we're now using Set and not the list anymore and use 79MB instead of 1MB. However, our program now runs in 17s instead of 34s, that's only 25% of the original time.

Using ST

Inspiration (C++)

int main(){
  std::vector<bool> Q(1000000,true);

  unsigned long long max_l = 0, max_c = 1;

  for(unsigned long i = 1; i < Q.size(); ++i){
    if(!Q[i])
      continue;
    unsigned long long c = i, l = 0;
    while(c != 1){
      if(c < Q.size()) Q[c] = false;
      c = c % 2 == 0 ? c / 2 : 3 * c + 1;
      l++;
    }
    if(l > max_l){
      max_l = l;
      max_c = i;
    }
  }
  std::cout << max_c << std::endl;
}

This program runs in 130ms. Our yet best version needs 100 times more. We can fix that.

Haskell

problem_14_vector_st :: Int -> (Int, Int)
problem_14_vector_st limit = 
  runST $ do
    q <- V.replicate (limit+1) True    
    best <- newSTRef (1,1)
    forM_ [1..limit] $ \i -> do
      b <- V.read q i
      when b $ do
        let csq = colSeq $ fromIntegral i
        let l   = fromIntegral $ length csq
        forM_ (map fromIntegral csq) $ \j-> 
          when (j<= limit && j>= 0) $  V.write q j False
        m <- fmap snd $ readSTRef best
        when (l > m) $ writeSTRef best (i,l)
    readSTRef best

Result

$ collatz_vector_st.exe +RTS -s
   2,762,282,216 bytes allocated in the heap
      10,021,016 bytes copied during GC
       1,026,580 bytes maximum residency (2 sample(s))
          21,684 bytes maximum slop
               2 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      5286 colls,     0 par    0.02s    0.02s     0.0000s    0.0000s
  Gen  1         2 colls,     0 par    0.00s    0.00s     0.0001s    0.0001s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    3.09s  (  3.08s elapsed)
  GC      time    0.02s  (  0.02s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    3.11s  (  3.11s elapsed)

  %GC     time       0.5%  (0.7% elapsed)

  Alloc rate    892,858,898 bytes per MUT second

  Productivity  99.5% of total user, 99.6% of total elapsed

~3 seconds. Someone else might know more tricks, but that's the most I could squeeze out of Haskell.

这篇关于欧拉项目#14 Haskell技巧?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆