为什么这个Haskell代码没有耗尽堆? [英] Why doesn't this Haskell code exhaust the heap?
问题描述
此代码将在顶层指定的两个(相互抵消的)无限列表的内容相加:
This code sums the contents of two (mutually cancelling) infinite lists specified at the top level:
{-# language BangPatterns #-}
module Main where
unending1 :: [Int]
unending1 = cycle [1]
unending2 :: [Int]
unending2 = cycle [negate 1]
main :: IO ()
main = do
let summator :: Int -> [Int] -> [Int] -> Int
summator !acc (i1 : rest1) (i2 : rest2) =
if acc > 100
then acc -- never happens
else summator (acc+i1+i2) rest1 rest2
print (summator 0 unending1 unending2)
我在没有优化的情况下编译了这段代码,并且堆大小很小,就像这样:
I compile this code with no optimizations and a low heap size, like this:
ghc -O0 -with-rtsopts="-M10m" Main.hs
我的直觉是该代码将导致内存泄漏,因为求和函数将尝试物化"这两个列表,并且它们的头都在顶层,因此不会被丢弃
My intuition is that this code would cause a memory leak, because the summation function would try to "materialize" the two lists, and the heads of both of them are at the top-level, so they won't be discarded.
但是,当我执行程序时,它似乎可以无限期地运行而没有问题.
However when I execute the program it seems to run indefinitely without problems.
我在哪里弄错了?
编辑.使用 -ddump-simpl
检查生成的Core,列表似乎仍位于顶层.例如:
Edit. Inspecting the generated Core using -ddump-simpl
, the lists seem to remain at the top level. For example:
Result size of Tidy Core = {terms: 77, types: 52, coercions: 0}
-- RHS size: {terms: 5, types: 3, coercions: 0}
unending1 :: [Int]
[GblId, Str=DmdType]
unending1 =
cycle
@ Int (GHC.Types.: @ Int (GHC.Types.I# 1#) (GHC.Types.[] @ Int))
推荐答案
正如chi回答的那样,由于 cycle
的定义,您的代码在恒定的空间中运行.
As chi answered, your code runs in constant space because of the definition of cycle
.
但是,即使使用 cycle xs = xs ++ cycle xs
,它也可以在 ghc -O0
的恒定空间中运行,因为顶级thunk(恒定的应用形式,CAF-s)可以被垃圾回收.闭包的信息表具有静态参考表",其中列出了静态闭包,例如
However, it runs in constant space with ghc -O0
even with cycle xs = xs ++ cycle xs
, because top-level thunks (constant applicative forms, CAF-s) can be garbage collected. The info tables of closures have "static reference tables", which list static closures such that
- 关闭代码中提到了它们
- 它们要么是顶级thunk,要么它们的代码可传递地指的是顶级thunk
文档此处.如果无法从GC根目录(包括线程状态对象的堆栈,因此在我们的情况下,正在执行中的 main
的关闭)到达顶级thunk,则将它们指向的堆对象丢弃
Documentation here. If top-level thunks can't be reached from GC roots (which include the stacks of thread-state objects, so in our case the closure of main
under execution), heap objects they point to are discarded.
这篇关于为什么这个Haskell代码没有耗尽堆?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!