Haskell中的懒惰和尾递归,为什么会崩溃? [英] Laziness and tail recursion in Haskell, why is this crashing?

查看:114
本文介绍了Haskell中的懒惰和尾递归,为什么会崩溃?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有这个相当简单的函数来计算一个大列表的元素的平均值,使用两个累加器来保存迄今为止的总数和到目前为止的数量:

 mean = go 0 0 
其中
go sl [] = s / fromIntegral l
go sl(x:xs)= go(s + x) (l + 1)xs

main = do
putStrLn(show(mean [0..10000000]))

现在,用严格的语言来说,这将是尾递归的,并且没有问题。然而,由于Haskell是懒惰的,我的谷歌搜索让我明白,(s + x)和(l + 1)将作为thunk传递给递归。所以这整个事情崩溃和烧伤:

 堆栈空间溢出:当前大小8388608字节。 

经过进一步的搜索后,我发现 seq $!。这似乎我不明白,因为我所有在这种情况下使用它们的尝试都是徒劳的,错误消息提到了无限类型。



最后,我发现 -XBangPatterns ,它通过改变递归调用来解决它:

  go! (x:xs)= go(s + x)(l + 1)xs 

但我对此并不满意,因为 -XBangPatterns 目前是一个扩展。我想知道如何在不使用 -XBangPatterns 的情况下进行严格的评估。 (也许可以学习一些东西!)



为了让你明白我的理解力不足,下面是我的尝试(编译的唯一的尝试,就是这样):

  go sl(x:xs)= go(seq s(s + x))(seq l(l + 1))xs 

据我所知,seq应该在这里强制评估s和l参数,从而避免由thunk引起的问题。但是我仍然遇到堆栈溢出。

解决方案

我已经在这方面写了大量的内容:



首先,是的,如果你想要求对累加器进行严格的评估,使用 seq 并保留在Haskell 98中:

  mean = go 0 0 
where
go sl [] = s / fromIntegral l
go sl(x:xs)= s`seq` l `seq`
go(s + x)(l + 1)xs

main = print $ mean [0..10000000]

* Main>主要
5000000.0

其次:如果您给出某些类型的注释,严格分析将会启动;使用-O2进行编译:

  mean:[Double]  - > Double 
mean = go 0 0
where
go :: Double - > Int - > [Double] - > Double
go sl [] = s / fromIntegral l
go sl(x:xs)= go(s + x)(l + 1)xs

main = print $意思是[0..10000000]

$ ghc -O2 --make A.hs
[1]编译Main(A.hs,Ao)
链接A。 ..

$时间./A
5000000.0
./A 0.46s用户0.01s系统99%cpu 0.470总额

因为'Double'是严格原子类型Double#的一个包装,通过对优化和精确类型的优化,GHC运行严格性分析并推断严格
$ b

  import Data.Array.Vector 

main = print(mean(enumFromToFracU 1 10000000))

data Pair = Pair!Int!Double

mean :: UArr Double - > Double
mean xs = s / fromIntegral n
其中
Pair ns = foldlU k(Pair 0 0)xs
k(Pair ns)x = Pair(n + 1)(s + x)

$ ghc -O2 --make A.hs -funbox-strict-fields
[1的1]编译Main(A.hs,Ao)
链接A ...

$时间./A
5000000.5
./A 0.03s用户0.00s系统96%cpu 0.038总额



如上面RWH章节所述。

I have this fairly simple function to compute the mean of elements of a big list, using two accumulators to hold the sum so far and the count so far:

mean = go 0 0
    where
      go s l []     = s / fromIntegral l
      go s l (x:xs) = go (s+x) (l+1) xs

main = do
  putStrLn (show (mean [0..10000000]))

Now, in a strict language, this would be tail-recursive, and there would be no problem. However, as Haskell is lazy, my googling has led me to understand that (s+x) and (l+1) will be passed down the recursion as thunks. So this whole thing crashes and burns:

Stack space overflow: current size 8388608 bytes.

After further googling, I found seq and $!. Which it seems I don't understand because all my attempts at using them in this context proved futile, with error messages saying something about infinite types.

Finally I found -XBangPatterns, which solves it all by changing the recursive call:

go !s !l (x:xs) = go (s+x) (l+1) xs

But I'm not happy with this, as -XBangPatterns is currently an extension. I would like to know how to make the evaluation strict without the use of -XBangPatterns. (And maybe learn something too!)

Just so you understand my lack of understanding, here's what I tried (the only try that compiled, that is):

go s l (x:xs) = go (seq s (s+x)) (seq l (l+1)) xs

From what I could understand, seq should here force the evaluation of the s and l argument, thus avoiding the problem caused by thunks. But I still get a stack overflow.

解决方案

I've written extensively on this:

Firstly, yes, if you want to require strict evaluation of the accumulators use seq and stay in Haskell 98:

mean = go 0 0
  where
    go s l []     = s / fromIntegral l
    go s l (x:xs) = s `seq` l `seq`
                      go (s+x) (l+1) xs

main = print $ mean [0..10000000]

*Main> main
5000000.0

Secondly: strictness analysis will kick in if you give some type annotations, and compile with -O2:

mean :: [Double] -> Double
mean = go 0 0
 where
  go :: Double -> Int -> [Double] -> Double
  go s l []     = s / fromIntegral l
  go s l (x:xs) = go (s+x) (l+1) xs

main = print $ mean [0..10000000]

$ ghc -O2 --make A.hs
[1 of 1] Compiling Main             ( A.hs, A.o )
Linking A ...

$ time ./A
5000000.0
./A  0.46s user 0.01s system 99% cpu 0.470 total

Because 'Double' is a wrapper over the strict atomic type Double#, with optimizations on, and a precise type, GHC runs strictness analysis and infers that the strict version will be ok.

import Data.Array.Vector

main = print (mean (enumFromToFracU 1 10000000))

data Pair = Pair !Int !Double

mean :: UArr Double -> Double   
mean xs = s / fromIntegral n
  where
    Pair n s       = foldlU k (Pair 0 0) xs
    k (Pair n s) x = Pair (n+1) (s+x)

$ ghc -O2 --make A.hs -funbox-strict-fields
[1 of 1] Compiling Main             ( A.hs, A.o )
Linking A ...

$ time ./A
5000000.5
./A  0.03s user 0.00s system 96% cpu 0.038 total

As described in the RWH chapter above.

这篇关于Haskell中的懒惰和尾递归,为什么会崩溃?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆