Haskell是否需要垃圾收集器? [英] Does Haskell require a garbage collector?

查看:111
本文介绍了Haskell是否需要垃圾收集器?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我很好奇为什么Haskell实现使用GC。



我不能想到GC在纯语言中是必需的情况。它只是一个优化,以减少复制,或者它实际上是必要的吗?



我正在寻找代码泄漏,如果GC不存在。正如其他人已经指出的那样,Haskell需要自动 >动态内存管理:自动内存管理是必要的,因为手动内存管理是不安全的;动态内存管理是必要的,因为对于某些程序,只能在运行时确定对象的生命周期。



例如,请考虑以下程序:

  main = loop (Just [1..1000])其中
loop :: Maybe [Int] - > IO()
loop obj = do
print obj
resp< - getLine
if resp ==clear
then循环Nothing
else循环obj

在这个程序中, [1..1000] code>必须保存在内存中,直到用户键入清除;所以这个的生命周期必须动态确定,这就是为什么动态内存管理是必要的。

所以在这个意义上,自动动态内存分配是必须的,实际上这意味着:,Haskell需要一个垃圾回收器,因为垃圾回收是性能最高的自动动态内存管理器。



然而... ...

虽然垃圾收集器是必要的,但我们可能会尝试找到一些特殊情况,其中编译器可以使用比垃圾收集更便宜的内存管理方案。例如,给定

  f :: Integer  - > Integer 
fx = let x2 = x * x in x2 * x2

我们可能希望编译器检测 x2 可以安全地在 f 返回时解除分配(而不是等待垃圾收集器释放 X2 )。本质上,我们要求编译器执行转义分析,将分配到垃圾收集堆转换为堆栈中分配



这不是太不合理的要求: jhc haskell编译器这样做,尽管GHC没有。 Simon Marlow ,GHC的世代垃圾收集器使得逃生分析大多不必要。

jhc实际上使用了一种复杂的逃生分析形式,称为区域推断。考虑

  f :: Integer  - > (整数,整数)
f x = let x2 = x * x in(x2,x2 + 1)

g :: Integer - >整数
g x =(y,z)的情况f x - > y + z

在这种情况下,简单的逃逸分析可以得出结论: x2 从 f (因为它在元组中返回),因此 x2 必须被分配到垃圾收集堆上。另一方面,区域推断可以检测 x2 可以在 g 返回时解除分配;这里的想法是应该在 g 的区域中分配 x2 ,而不是 f 的区域。



Beyond Haskell



区域推断在某些情况下很有用如上所述,似乎很难与惰性评估有效协调(参见 Edward Kmett's Simon Peyton Jones 的评论)。例如,考虑

  f :: Integer  - >整数
fn =产品[1..n]

可能会试图分配列表 [1..n] 并在 f 返回后将其解除分配,但这会是灾难性的:它会改变 f 从使用O(1)内存(垃圾收集下)到O(n)内存。

在二十世纪九十年代和二十一世纪初,对 strict 功能语言ML进行了区域推理。 Mads Tofte,Lars Birkedal,Martin Elsman,Niels Hallenberg写了一本相当可读的回顾他们在区域推理方面的工作,其中大部分都集成到 MLKit编译器中。他们尝试了纯粹的基于区域的内存管理(即没有垃圾收集器)以及混合的基于区域/垃圾回收的内存管理,并报告说他们的测试程序比纯垃圾回收运行速度快10倍和慢4倍收集版本。


I'm curious as to why Haskell implementations use a GC.

I can't think of a case where GC would be necessary in a pure language. Is it just an optimization to reduce copying, or is it actually necessary?

I'm looking for example code that would leak if a GC wasn't present.

解决方案

As others have already pointed out, Haskell requires automatic, dynamic memory management: automatic memory management is necessary because manual memory management is unsafe; dynamic memory management is necessary because for some programs, the lifetime of an object can only be determined at runtime.

For example, consider the following program:

main = loop (Just [1..1000]) where
  loop :: Maybe [Int] -> IO ()
  loop obj = do
    print obj
    resp <- getLine
    if resp == "clear"
     then loop Nothing
     else loop obj

In this program, the list [1..1000] must be kept in memory until the user types "clear"; so the lifetime of this must be determined dynamically, and this is why dynamic memory management is necessary.

So in this sense, automated dynamic memory allocation is necessary, and in practice this means: yes, Haskell requires a garbage collector, since garbage collection is the highest-performance automatic dynamic memory manager.

However...

Although a garbage collector is necessary, we might try to find some special cases where the compiler can use a cheaper memory management scheme than garbage collection. For instance, given

f :: Integer -> Integer
f x = let x2 = x*x in x2*x2

we might hope for the compiler to detect that x2 can safely be deallocated when f returns (rather than waiting for the garbage collector to deallocate x2). Essentially, we are asking that the compiler perform escape analysis to convert allocations in to garbage-collected heap to allocations on the stack wherever possible.

This is not too unreasonable to ask for: the jhc haskell compiler does this, although GHC does not. Simon Marlow says that GHC's generational garbage collector makes escape analysis mostly unnecessary.

jhc actually uses a sophisticated form of escape analysis known as region inference. Consider

f :: Integer -> (Integer, Integer)
f x = let x2 = x * x in (x2, x2+1)

g :: Integer -> Integer
g x = case f x of (y, z) -> y + z

In this case, a simplistic escape analysis would conclude that x2 escapes from f (because it is returned in the tuple), and hence x2 must be allocated on the garbage-collected heap. Region inference, on the other hand, is able to detect that x2 can be deallocated when g returns; the idea here is that x2 should be allocated in g's region rather than f's region.

Beyond Haskell

While region inference is helpful in certain cases as discussed above, it appears to be difficult to reconcile effectively with lazy evaluation (see Edward Kmett's and Simon Peyton Jones' comments). For instance, consider

f :: Integer -> Integer
f n = product [1..n]

One might be tempted to allocate the list [1..n] on the stack and deallocate it after f returns, but this would be catastrophic: it would change f from using O(1) memory (under garbage collection) to O(n) memory.

Extensive work was done in the 1990s and early 2000s on region inference for the strict functional language ML. Mads Tofte, Lars Birkedal, Martin Elsman, Niels Hallenberg have written a quite readable retrospective on their work on region inference, much of which they integrated into the MLKit compiler. They experimented with purely region-based memory management (i.e. no garbage collector) as well as hybrid region-based/garbage-collected memory management, and reported that their test programs ran "between 10 times faster and 4 times slower" than pure garbage-collected versions.

这篇关于Haskell是否需要垃圾收集器?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆