Clojure head retention


Problem description

I'm reading the Clojure Programming book by O'Reilly.



I came across an example of head retention. The first example retains a reference to d (I presume), so it doesn't get garbage collected:

(let [[t d] (split-with #(< % 12) (range 1e8))]
    [(count d) (count t)])
;= #<OutOfMemoryError java.lang.OutOfMemoryError: Java heap space>

While the second example doesn't retain it, so it runs with no problem:

(let [[t d] (split-with #(< % 12) (range 1e8))]
    [(count t) (count d)])
;= [12 99999988]

What I don't get here is what exactly is retained in which case, and why. If I try to return just [(count d)], like this:

(let [[t d] (split-with #(< % 12) (range 1e8))]
    [(count d)])



it seems to create the same memory problem.



Further, I recall reading that count realizes/evaluates a sequence in every case, so I need that clarified. If I return (count t) first, how is that faster/more memory-efficient than not returning it at all? And what gets retained in which case, and why?

Solution

In both the first and the final examples, the original sequence passed to split-with is retained while being realized in full in memory; hence the OOME. The way this happens is indirect: what is retained directly is t, while the original sequence is held onto by t, a lazy seq, in its unrealized state.



The way t causes the original sequence to be held is as follows. Prior to being realized, t is a LazySeq object storing a thunk which may be called at some point to realize t; this thunk needs to store a pointer to the original sequence argument of split-with, before it is realized, so that it can pass it on to take-while -- see the implementation of split-with. Once t is realized, the thunk becomes eligible for GC (the field which holds it in the LazySeq object is set to null) and t no longer holds on to the head of the huge input seq.
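
For reference, split-with in clojure.core is defined essentially as follows (a simplified sketch, metadata omitted); note that both halves close over the same coll argument, which is why the unrealized take-while thunk behind t has to keep a pointer to the original sequence:

;; Simplified sketch of clojure.core/split-with
(defn split-with
  "Returns a vector of [(take-while pred coll) (drop-while pred coll)]."
  [pred coll]
  [(take-while pred coll) (drop-while pred coll)])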



The input seq itself is realized in full by (count d), which needs to realize d, and thus the original input seq.



Moving on to why t is being retained:



In the first case, this is because (count d) gets evaluated before (count t). Since Clojure evaluates these expressions left to right, the local t needs to hang around for the second call to count, and since it happens to hold on to a huge seq (as explained above), that leads to the OOME.



The final example, where only (count d) is returned, should ideally not hold on to t; the reason that is not the case is somewhat subtle and is best explained by referring to the second example.



The second example happens to work fine, because after (count t) is evaluated, t is no longer needed. The Clojure compiler notices this and uses a clever trick to have the local reset to nil simultaneously with the count call being made. The crucial piece of Java code does something like f(t, t=null): the current value of t is passed to the appropriate function, but the local is cleared before control is handed over to f, since the clearing happens as a side effect of the expression t=null, which is itself an argument to f; clearly Java's left-to-right evaluation semantics are key to making this work.
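
You can try to observe locals clearing directly, since Clojure exposes a compiler option for disabling it (intended for debugging). The following is an untested sketch; it assumes that binding *compiler-options* with :disable-locals-clearing around eval is enough to compile the form without the clearing code. With clearing off, the local d is never nil-ed out before the call to count, so everything (count d) realizes stays reachable and even the "good" ordering is expected to hit the same OutOfMemoryError:

;; Untested sketch: compile the second example with locals clearing disabled.
;; :disable-locals-clearing is a standard compiler option; the assumption here
;; is that binding *compiler-options* around eval applies it to this form.
(binding [*compiler-options* {:disable-locals-clearing true}]
  (eval '(let [[t d] (split-with #(< % 12) (range 1e8))]
           [(count t) (count d)])))
;; Expected (assumption): OutOfMemoryError, because d is no longer cleared
;; before the count call and therefore pins the realized tail in memory.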



Back to the final example: this doesn't work, because t is not actually used anywhere, and unused locals are not handled by the locals-clearing process. (The clearing happens at the point of last use; in the absence of such a point in the program, there is no clearing.)
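
One practical takeaway (my own sketch, not part of the original answer): if only (count d) is needed, don't bind t at all; calling drop-while directly leaves no unused local around to pin the head of the sequence:

;; Only the "dropped" part is needed, so skip split-with and the destructuring.
(let [d (drop-while #(< % 12) (range 1e8))]
  [(count d)])
;= [99999988]
;; d's last (and only) use is the count call, so locals clearing nils it out
;; before count runs, and the realized elements can be collected as count walks.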



As for count realizing lazy sequences: it must do that, as there is no general way to predict the length of a lazy seq without realizing it.
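
For intuition, counting an arbitrary lazy seq amounts to walking it to the end, realizing every element along the way -- roughly like this toy version (a sketch only; the real clojure.core/count also has fast paths for counted collections such as vectors):

;; Toy sketch: counting by walking the seq forces every element to be realized.
(defn count-by-walking [coll]
  (loop [s (seq coll), n 0]
    (if s
      (recur (next s) (inc n))
      n)))

(count-by-walking (take-while #(< % 12) (range 1e8)))
;= 12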

