Can a thunk be duplicated to improve memory performance?


Question

One of my struggles with lazy evaluation in Haskell is the difficulty of reasoning about memory usage. I think the ability to duplicate a thunk would make this much easier for me. Here's an example.

Let's create a really big list:

let xs = [1..10000000]

Now, let's create a bad function:

import Data.List (foldl1')

bad = do
    print $ foldl1' (+) xs
    print $ length xs

With no optimizations, this eats up a few dozen MB of RAM. The garbage collector can't deallocate xs during the fold because the whole list is still needed afterwards to calculate the length.

Is it possible to reimplement this function along these lines?

good = do
    (xs1,xs2) <- copyThunk xs
    print $ foldl1' (+) xs1
    print $ length xs2

Now xs1 and xs2 would represent the same value but be independent of each other in memory, so the garbage collector could free list cells during the fold and avoid the wasted memory. (I think this would slightly increase the computational cost, though, since the list would be built twice?)

Obviously, in this trivial example, refactoring the code would easily solve the problem, but it is not always obvious how to refactor, and sometimes refactoring would greatly reduce code clarity.
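For reference, one possible refactoring of the trivial example is a single strict pass that computes both results at once, so no second traversal needs the list to stay alive. This is only a sketch: the name `sumAndLength` is made up for illustration, and unlike `foldl1'` it returns `(0, 0)` for an empty list instead of raising an error.

```haskell
{-# LANGUAGE BangPatterns #-}

-- | Hypothetical helper (not part of any library): computes the sum and
-- the length of a list in one traversal. Both accumulators are forced at
-- every step, so no chain of unevaluated additions builds up.
sumAndLength :: [Integer] -> (Integer, Int)
sumAndLength = go 0 0
  where
    go !s !n []       = (s, n)
    go !s !n (y : ys) = go (s + y) (n + 1) ys

main :: IO ()
main = do
  -- The list is consumed as it is produced; nothing else refers to it,
  -- so already-visited cells can be collected during the traversal.
  let (s, n) = sumAndLength [1 .. 10000000]
  print s
  print n
```

This works here because both consumers are simple folds. It also shows the cost the question worries about: the two computations are now fused into one function, which is less clear than two separate calls.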

Solution

I was wondering the same thing a while ago and created a prototype implementation of such a thunk-duplication function. You can read about the result in my preprint "dup – Explicit un-sharing in Haskell" and see the code at http://darcs.nomeata.de/ghc-dup. Unfortunately, the paper was accepted for neither the Haskell Symposium nor the Haskell Implementors Workshop this year.

To my knowledge, there is no real-world-ready solution to the problem, only fragile workarounds such as the unit parameter trick, which might break under one compiler optimization or another.
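A minimal sketch of the unit parameter trick as it is commonly understood: the list is hidden behind a function that takes `()`, so each call site can rebuild it instead of sharing one copy. The name `mkXs` is made up for illustration. Whether sharing is actually avoided depends on the compiler: with optimizations enabled, GHC's full-laziness pass may float the list out of the function, and common subexpression elimination may merge the two calls, either of which restores the sharing. Compiling with `-fno-full-laziness` is one way people try to guard against the first of these.

```haskell
import Data.List (foldl1')

-- | Hypothetical generator: the list is the result of a function call
-- rather than a shared top-level value, so each call may produce a
-- fresh copy that can be collected as soon as it has been consumed.
mkXs :: () -> [Integer]
mkXs () = [1 .. 10000000]

main :: IO ()
main = do
  -- Each consumer requests its own list. This trades extra computation
  -- (the list is built twice) for lower peak memory, but only as long
  -- as the compiler does not re-introduce sharing.
  print $ foldl1' (+) (mkXs ())
  print $ length (mkXs ())
```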
