Python: is the garbage collector run before a MemoryError is raised?

Question

In a Python program that iterates over a sequence of 30 problems involving memory- and CPU-intensive numerical computations, I observe that the memory consumption of the Python process grows by ~800MB at the beginning of each of the 30 iterations, and a MemoryError is finally raised in the 8th iteration (where the system's memory is in fact exhausted). However, if I import gc and let gc.collect() run after each iteration, the memory consumption remains constant at ~2.5GB and the program terminates cleanly after solving all 30 problems. The code only uses the data of 2 consecutive problems and there are no reference cycles (otherwise the manual garbage collection would also not be able to keep the memory consumption down).

This behavior raises the question of whether Python tries to run the garbage collector before it raises a MemoryError. In my opinion, this would be a perfectly sane thing to do, but perhaps there are reasons against it?

A similar observation was made here: https://stackoverflow.com/a/4319539/1219479

Recommended answer

Actually, there are reference cycles, and they are the only reason the manual gc.collect() calls are able to reclaim memory at all.

In Python (I'm assuming CPython here), the garbage collector's sole purpose is to break reference cycles. When none are present, objects are destroyed and their memory is reclaimed at the exact moment the last reference to them is lost.
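The difference between the two mechanisms can be observed directly. The following is a minimal sketch, assuming CPython; the `Node` class is a hypothetical stand-in for any container object, and the automatic collector is disabled only so that it cannot interfere with the demonstration.

```python
import gc
import weakref


class Node:
    """Hypothetical container object used only for this illustration."""

    def __init__(self):
        self.other = None


gc.disable()  # keep the automatic collector out of the way for the demo

# Case 1: no cycle. Dropping the last reference frees the object at once.
plain = Node()
plain_alive = weakref.ref(plain)
del plain
freed_without_gc = plain_alive() is None

# Case 2: a two-object cycle. The reference counts never reach zero on
# their own, so the objects stay alive after the names are deleted.
a, b = Node(), Node()
a.other, b.other = b, a
cycle_alive = weakref.ref(a)
del a, b
survives_del = cycle_alive() is not None

# Only the cycle detector can find and free the unreachable pair.
unreachable = gc.collect()
freed_by_gc = cycle_alive() is None

gc.enable()
print(freed_without_gc, survives_del, freed_by_gc, unreachable)
```

If memory only comes back after gc.collect(), as in the question, the second case is what is happening somewhere in the program.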

As for when the garbage collector is run, the full documentation is here: http://docs.python.org/2/library/gc.html

The TL;DR is that Python maintains an internal counter of object allocations and deallocations. Whenever (allocations - deallocations) exceeds threshold 0 (700 by default), a garbage collection is run and both counters are reset.
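The thresholds and the current counters can be inspected at runtime. This is a small sketch using gc.get_threshold() and gc.get_count() from the standard gc module; the variable names are only illustrative, and the actual values depend on the interpreter version and on what the process has allocated so far.

```python
import gc

# The three configured thresholds (threshold 0, 1 and 2).
threshold0, threshold1, threshold2 = gc.get_threshold()

# The current collection counts for the three generations.
count0, count1, count2 = gc.get_count()

print("thresholds:", threshold0, threshold1, threshold2)
print("counts:    ", count0, count1, count2)
```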

Every time a collection happens (either automatically or manually via gc.collect()), generation 0 (all objects that haven't yet survived a collection) is collected. That is, objects with no accessible references are walked through, looking for reference cycles; if any are found, the cycles are broken, possibly leading to objects being destroyed because no references to them are left. All objects that remain after that collection are moved to generation 1.

Every 10 collections (threshold 1), generation 1 is also collected, and all objects in generation 1 that survive are moved to generation 2. Every 10 collections of generation 1 (that is, every 100 collections; threshold 2), generation 2 is also collected. Objects that survive that are left in generation 2; there is no generation 3.

These 3 thresholds can be set by the user by calling gc.set_threshold(threshold0, threshold1, threshold2).
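As a rough illustration, the thresholds could be lowered around an allocation-heavy section and restored afterwards. The `gc_thresholds` helper below is hypothetical (it is not part of the gc module), and the values 100, 5, 5 are arbitrary; newer CPython releases may interpret the later thresholds differently, so the gc documentation for the version in use is the authority here.

```python
import gc
from contextlib import contextmanager


@contextmanager
def gc_thresholds(threshold0, threshold1, threshold2):
    """Temporarily override the collector thresholds (hypothetical helper)."""
    previous = gc.get_threshold()
    gc.set_threshold(threshold0, threshold1, threshold2)
    try:
        yield
    finally:
        # Always restore the previous settings, even if the body raises.
        gc.set_threshold(*previous)


before = gc.get_threshold()
with gc_thresholds(100, 5, 5):
    # Inside the block the collector is triggered more eagerly.
    inside = gc.get_threshold()
after = gc.get_threshold()

print(before, inside, after)
```

Lowering threshold 0 makes automatic collections more frequent, but it still gives no guarantee that one will run before memory is exhausted.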

What this all means for your program:

  1. The GC is not the mechanism CPython uses to reclaim memory (refcounting is). The GC breaks reference cycles in "dead" objects, which may lead to some of them being destroyed.
  2. No, there are no guarantees that the GC will run before a MemoryError is raised.
  3. You have reference cycles. Try to get rid of them.
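One common way to get rid of a cycle is to make one of its links a weak reference. The sketch below assumes a parent/child structure, which is only a guess at what the real code might look like; `Parent` and `Child` are hypothetical names.

```python
import gc
import weakref


class Child:
    """Hypothetical object that needs to find its owner."""

    parent = None  # will hold a weak reference to the owning Parent


class Parent:
    """Hypothetical owner that holds strong references to its children."""

    def __init__(self):
        self.children = []

    def add(self, child):
        self.children.append(child)
        # Weak back-reference: the child does not keep the parent alive,
        # so parent -> child -> parent is no longer a strong cycle.
        child.parent = weakref.ref(self)


gc.disable()  # show that refcounting alone is enough here

parent = Parent()
parent.add(Child())
probe = weakref.ref(parent)

del parent  # last strong reference gone
freed_immediately = probe() is None

gc.enable()
print(freed_immediately)
```

With the back-reference weakened, the whole structure is released by reference counting as soon as the parent goes out of scope, and no gc.collect() call is needed. A child that needs its owner calls `child.parent()` and must handle a result of None.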
