Will Python automatically garbage collect a doubly-linked list?


Question


Background

I have a tree structure. Within this tree structure I maintain the children of each node as a doubly-linked list:


(Figure omitted: diagram of a doubly linked list. Source: Doubly linked list)

(I chose this structure because the list is created by a breadth-first search.)

Problem

Now my concern is whether the garbage collector can automatically destroy this list. Naturally, I keep only a reference to the root node of such a tree. AFAIK the principle of GC is that it collects data structures in memory that no reference points to. But in a doubly-linked list each node is referenced by its sibling, and the sibling is referenced by the node. So there will always be a reference to each node, and the GC would never collect it.

Will garbage collector handle doubly-linked list?

If not, what is the easiest way to collect it?

Related questions:

Why does Lua use a garbage collector instead of reference counting?
Python: Memory usage and optimization when modifying lists

Solution

Each Python implementation has a different garbage collection scheme. The general-purpose answer is "Yes, if it's garbage, it should be garbage collected." But you presumably want something more specific than this.


In CPython, the garbage collection uses refcounting, plus a cycle collector. If an object's refcount drops to 0, it gets cleaned up. But in your case, when all external references to your list go away, there will still be internal references, so refcounting by itself cannot solve your problem. That's what the cycle collector is for.

Assuming your nodes do not have __del__ methods, and you have not (directly or indirectly) disabled "supplemental garbage collection" (it's on by default), the cycle collector will detect that your nodes all refer to each other, but nothing else refers to them, and clean them up. (This could take two passes, because it uses a generational system.)
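To make the difference between refcounting and the cycle collector concrete, here is a minimal sketch for CPython. The Node class and build_list helper are hypothetical names invented for illustration, not part of the question's code. A weak reference acts as a probe: it observes a node without keeping it alive.

```python
import gc
import weakref


class Node:
    """Hypothetical doubly-linked list node (names are illustrative)."""

    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None


def build_list(values):
    """Link nodes in both directions and return the head node."""
    head = tail = None
    for value in values:
        node = Node(value)
        if tail is None:
            head = node
        else:
            tail.next = node
            node.prev = tail
        tail = node
    return head


gc.disable()                      # keep the automatic collector out of the way
head = build_list(range(5))
probe = weakref.ref(head.next)    # watches a node without keeping it alive

del head                          # drop the only external reference
alive_after_del = probe() is not None      # refcounting alone leaves the cycle

gc.collect()                      # run the cycle collector explicitly
alive_after_collect = probe() is not None  # the cycle has been reclaimed
gc.enable()

print(alive_after_del, alive_after_collect)
```

On CPython the node should still be alive after the del and gone after gc.collect(). Other implementations may free it at a different moment, so treat this as a CPython-specific experiment.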

You can use the gc module to explicitly run the cycle collector (gc.collect()) instead of waiting for it, or to inspect what it's doing. For example, if you do this:

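import gc

gc.collect()                  # start from a clean state
oldcounts = gc.get_counts()   # per-generation collection counters
del last_reference_to_list    # placeholder: drop your last external reference
unreachable = gc.collect()    # returns the number of unreachable objects found
newcounts = gc.get_counts()
print(oldcounts, newcounts, unreachable)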

… you should be able to tell (not with perfect reliability, but well enough for learning and testing purposes) that your nodes are all gone.


What if your nodes do have __del__ methods? On CPython versions before 3.4, you will have to give the GC some help, because cycles containing objects with __del__ methods are not freed automatically (they end up in gc.garbage instead). Since CPython 3.4 (PEP 442) such cycles are collected as well, so the rest of this paragraph matters mainly for older versions. What you need to do is break any cycles that include objects with __del__ methods. The obvious way to do that, if you don't have any node-sharing between lists, is to just walk the list and del the forward and back pointers. (Technically, you only need to del one or the other, but you might as well do both.) If you need the __del__ method on the nodes, you probably need one on the top-level dl_list (or tree_node or whatever it is that owns these), so that's an obvious place to put it.

Of course if you don't need the __del__ method, there's an even easier solution: just get rid of it.
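Walking the list and cutting the links by hand could look roughly like the sketch below. The Node and DList classes and the clear() method are hypothetical; DList merely plays the role of the owning dl_list mentioned above. The sketch uses an explicit clear() call rather than __del__ so the effect is easy to observe, and it disables the cycle collector to show that refcounting alone frees the nodes once the links are cut.

```python
import gc
import weakref


class Node:
    """Hypothetical node type; attribute names are illustrative."""

    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None


class DList:
    """Hypothetical owner of the nodes, playing the role of dl_list."""

    def __init__(self, values=()):
        self.head = self.tail = None
        for value in values:
            self.append(value)

    def append(self, value):
        node = Node(value)
        if self.tail is None:
            self.head = node
        else:
            self.tail.next = node
            node.prev = self.tail
        self.tail = node
        return node

    def clear(self):
        """Walk the list and cut both links so no reference cycle remains."""
        node = self.head
        while node is not None:
            following = node.next
            node.prev = node.next = None
            node = following
        self.head = self.tail = None


gc.disable()                            # rely on reference counting only
items = DList("abc")
probe = weakref.ref(items.head.next)    # watches the middle node

items.clear()                           # break the cycles by hand
freed_without_gc = probe() is None
gc.enable()

print(freed_without_gc)
```

Whether clear() is called explicitly, from the owner's __del__, or from a context manager is a design choice; the explicit call is simply the most predictable of the three.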


One last possibility is to use weakref for the back links, but regular references for the forward links. That way, there are no possible cycles. But you will have to be a bit careful adding and removing nodes to make sure you never temporarily leave a node with nothing but a weakref to it.
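A minimal sketch of the weakref variant, assuming a hypothetical Node class whose prev attribute is a property wrapping weakref.ref (the link helper is likewise made up for illustration). Forward links are ordinary strong references, so the head keeps the whole chain alive, and dropping the head frees everything through refcounting alone.

```python
import gc
import weakref


class Node:
    """Hypothetical node: strong forward link, weak back link."""

    def __init__(self, value):
        self.value = value
        self.next = None          # strong reference keeps the successor alive
        self._prev = None         # weakref.ref to the predecessor, or None

    @property
    def prev(self):
        # Dereference the weak link; yields None if there is no predecessor
        # or if the predecessor has already been freed.
        return self._prev() if self._prev is not None else None

    @prev.setter
    def prev(self, node):
        self._prev = weakref.ref(node) if node is not None else None


def link(first, second):
    """Connect two nodes so that first precedes second."""
    first.next = second
    second.prev = first


gc.disable()                      # rely on reference counting only
a, b, c = Node(1), Node(2), Node(3)
link(a, b)
link(b, c)
back_link_works = c.prev is b and b.prev is a

probe = weakref.ref(c)
del b, c                          # still reachable through a.next
still_alive = probe() is not None
del a                             # head gone: the whole chain is freed
freed = probe() is None
gc.enable()

print(back_link_works, still_alive, freed)
```

The caveat from the text shows up here directly: if the only remaining reference to a node is some other node's weak back link, the node is freed and prev silently becomes None, so insertion and removal code has to hold a strong reference while it rewires links.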


If you're using Jython or IronPython, the garbage collection is tied to the underlying runtime (JVM or .NET), so you will have to read the appropriate documentation.

PyPy has its own garbage collector (actually, a choice of different options), which is described in the PyPy documentation.

If you're using a less-common implementation, there should be similar docs available.
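If you are not sure which implementation your code runs on, the standard library can tell you. This is only a small sketch; the uses_refcounting variable is an illustrative name and encodes nothing more than the CPython behavior described above.

```python
import platform

# Returns a string such as 'CPython', 'PyPy', 'Jython', or 'IronPython'.
implementation = platform.python_implementation()

# Per the answer above, CPython combines refcounting with a cycle collector;
# for other implementations, consult their own documentation.
uses_refcounting = implementation == "CPython"

print(implementation, uses_refcounting)
```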
