Memory leaks when image discarded in Python

Problem description

I'm currently writing a simple board game in Python, and I just realized that garbage collection doesn't purge the discarded bitmap data from memory when images are reloaded. This happens only when a game is started or loaded or when the resolution changes, but it multiplies the memory consumed, so I can't leave this problem unsolved.

When images are reloaded, all references are transferred to the new image data, since it is bound to the same variable the original image data was bound to. I tried to force garbage collection with gc.collect(), but it didn't help.

I wrote a small sample to demonstrate my problem.

from tkinter import Button, DISABLED, Frame, Label, NORMAL, Tk
from PIL.Image import open
from PIL.ImageTk import PhotoImage

class App(Tk):
    def __init__(self):
        Tk.__init__(self)
        self.text = Label(self, text = "Please check the memory usage. Then push button #1.")
        self.text.pack()
        self.btn = Button(self, text = "#1", command = lambda : self.buttonPushed(1))
        self.btn.pack()

    def buttonPushed(self, n):
        """Cycle to open the Tab frame n times."""
        self.btn.configure(state = DISABLED) # disable to prevent parallel cycles
        if n == 100:
            self.text.configure(text = "Overwriting the bitmap with itself 100 times...\n\nCheck the memory usage!\n\nUI may seem to hang but it will finish soon.")
            self.update_idletasks()
        for i in range(n):      # creates the Tab frame with the img, destroys it, then recreates them to overwrite the previous Frame and previous img
            b = Tab(self)
            b.destroy()
            if n == 100:
                print(i+1,"percent of processing finished.")
        if n == 1:
            self.text.configure(text = "Please check the memory usage now.\nMost of the difference is caused by the bitmap opened.\nNow push button #100.")
            self.btn.configure(text = "#100", command = lambda : self.buttonPushed(100))
        self.btn.configure(state = NORMAL)  # starting cycles is enabled again       

class Tab(Frame):
    """Creates a frame with a picture in it."""
    def __init__(self, master):
        Frame.__init__(self, master = master)
        self.a = PhotoImage(open("map.png"))    # img opened, change this to a valid one to test it
        self.b = Label(self, image = self.a)
        self.b.pack()                           # Label with img appears in Frame
        self.pack()                             # Frame appears

if __name__ == '__main__':
    a = App()
    a.mainloop()    # start the Tk event loop so the window stays open

To run the code above you will need a PNG image file. My map.png's dimensions are 1062×1062. As a PNG it is 1.51 MB and as bitmap data it is about 3-3.5 MB. Use a large image to see the memory leak easily.

What happens when you run my code: the Python process eats up memory cycle by cycle. When it has consumed approximately 500 MB, the usage collapses, but then it starts to eat up memory again.

Please give me some advice on how to solve this issue. I'm grateful for any help. Thank you in advance.

Solution

First, you definitely do not have a memory leak. If it "collapses" whenever it gets near 500MB and never crosses it, it can't possibly be leaking.


And my guess is that you don't have any problem at all.

When Python's garbage collector cleans things up (which generally happens immediately when you're done with them in CPython), it generally doesn't actually release the memory to the OS. Instead, it keeps it around in case you need it later. This is intentional: unless you're thrashing swap, it's a whole lot faster to reuse memory than to keep freeing and reallocating it.
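To separate "Python still holds these objects" from "the process still looks big to the OS", one option is the standard tracemalloc module. The following is a minimal sketch, assuming 1 MB bytearray objects as stand-ins for decoded bitmaps; the variable names are illustrative only. Note that tracemalloc only sees memory allocated through Python's allocator, so image data held by C libraries (which likely includes Tk) would not appear in these numbers.

```python
import tracemalloc


def traced_bytes():
    """Return the bytes currently allocated through Python's allocator."""
    current, _peak = tracemalloc.get_traced_memory()
    return current


tracemalloc.start()
baseline = traced_bytes()

# Stand-ins for decoded images: twenty buffers of about 1 MB each (an assumption).
bitmaps = [bytearray(1_000_000) for _ in range(20)]
while_held = traced_bytes()

# Dropping the last reference lets CPython free the buffers right away.
del bitmaps
after_release = traced_bytes()
tracemalloc.stop()

print("held:    ", while_held - baseline, "bytes")
print("released:", after_release - baseline, "bytes")
```

If the traced figure falls back toward the baseline while the process size reported by the OS stays high, the objects really were freed and the allocator is simply keeping the memory for reuse.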

Also, if 500MB is virtual memory, that's nothing on a modern 64-bit platform. If it's not mapped to physical/resident memory (or is mapped if the computer is idle, but quickly tossed otherwise), it's not a problem; it's just the OS being nice with resources that are effectively free.

More importantly: what makes you think there's a problem? Is there any actual symptom, or just something in Task Manager/Activity Monitor/top/whatever that scares you? (If the latter, take a look at the other programs. On my Mac, I've got 28 programs currently running using over 400MB of virtual memory, and I'm using 11 out of 16GB, even though less than 3GB is actually wired. If I, say, fire up Logic, the memory will be collected faster than Logic can use it; until then, why should the OS waste effort unmapping memory, especially when it has no way to be sure some process won't later ask for the memory it wasn't using?)


But if there is a real problem, there are two ways to solve it.


The first trick is to do everything memory-intensive in a child process that you can kill and restart to recover the temporary memory (e.g., by using multiprocessing.Process or concurrent.futures.ProcessPoolExecutor).

This usually makes things slower rather than faster. And it's obviously not easy to do when the temporary memory is mostly things that go right into the GUI, and therefore have to live in the main process.
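As a rough illustration of the child-process approach, here is a minimal sketch using concurrent.futures.ProcessPoolExecutor. The function names summarize_big_data and run_in_child are hypothetical, and the "work" is just a large list standing in for whatever is memory-intensive in a real program. Only the small result crosses back to the parent; the worker's memory goes back to the OS when the worker exits.

```python
from concurrent.futures import ProcessPoolExecutor


def summarize_big_data(n):
    """Hypothetical memory-hungry job: build a large list, return a small result."""
    data = list(range(n))      # large temporary allocation lives in the child
    return sum(data)


def run_in_child(n):
    """Run the job in a short-lived worker process and return its result."""
    # Leaving the with-block shuts the pool down, so the single worker
    # exits and all of its memory is returned to the OS.
    with ProcessPoolExecutor(max_workers=1) as pool:
        return pool.submit(summarize_big_data, n).result()


if __name__ == "__main__":
    # The guard matters on platforms that start workers by re-importing
    # the main module.
    print(run_in_child(1_000_000))
```

This only helps when the result is small and picklable; objects such as a Tkinter PhotoImage cannot be handed back to the GUI process this way.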


The other option is to figure out where the memory's being used and not keep so many objects around at the same time. Basically, there are two parts to this:

First, release everything possible before the end of each event handler. This means calling close on files, either using del on objects or setting all references to them to None, calling destroy on GUI objects that aren't visible, and, most of all, not storing references to things you don't need. (Do you actually need to keep the PhotoImage around after you use it? If you do, is there any way you can load the images on demand?)
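One way to picture "load on demand, release when done" is a small holder object that owns the only long-lived reference to each image. This is a sketch under stated assumptions: OnDemandImages and FakeImage are hypothetical names, and FakeImage stands in for a real decoded image so the example runs without PIL or a display. The weakref probe is there only to observe when the object is actually freed.

```python
import weakref


class FakeImage:
    """Stand-in for a decoded image such as a PhotoImage (hypothetical)."""

    def __init__(self, path):
        self.path = path
        self.pixels = bytearray(1_000_000)   # pretend bitmap data


class OnDemandImages:
    """Load images lazily and forget them when they are no longer shown."""

    def __init__(self, loader):
        self._loader = loader    # callable that turns a path into an image
        self._loaded = {}        # path -> image, the only long-lived references

    def get(self, path):
        if path not in self._loaded:
            self._loaded[path] = self._loader(path)
        return self._loaded[path]

    def release(self, path):
        # Dropping the stored reference lets CPython free the image at once,
        # provided nothing else still refers to it.
        self._loaded.pop(path, None)


store = OnDemandImages(FakeImage)
image = store.get("map.png")
probe = weakref.ref(image)              # observe the object without keeping it alive
del image                               # the store still holds a reference
still_cached = probe() is not None
store.release("map.png")
freed = probe() is None                 # no references remain, so it is gone
print(still_cached, freed)
```

With Tkinter the loader would build the real PhotoImage, and the image must stay referenced for as long as a widget displays it, so release belongs after the widget has been destroyed.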

Next, make sure you have no reference cycles. In CPython, garbage is cleaned up immediately as long as there are no cycles—but if there are, they sit around until the cycle checker runs. You can use the gc module to investigate this. One really quick thing to do is try this every so often:

import gc

print(gc.get_count())   # current collection counts per generation
gc.collect()            # force a full collection
print(gc.get_count())

If you see huge drops, you've got cycles. (gc.collect() also returns the number of unreachable objects it found, which is a more direct signal.) You'll have to look inside gc.get_objects() and gc.garbage, or attach callbacks via gc.callbacks, or just reason about your code to find exactly where the cycles are. For each one, if you don't really need references in both directions, get rid of one; if you do, change one of them into a weakref.
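The difference a weakref makes can be seen in a small experiment. This is a sketch, assuming a hypothetical Node class with a parent/child relationship; it relies on gc.collect() returning the number of unreachable objects it found, and on CPython's reference counting.

```python
import gc
import weakref


class Node:
    """Hypothetical tree node with links in both directions."""

    def __init__(self):
        self.parent = None
        self.children = []


def build_with_strong_backref():
    parent, child = Node(), Node()
    parent.children.append(child)
    child.parent = parent                 # parent <-> child: a reference cycle


def build_with_weak_backref():
    parent, child = Node(), Node()
    parent.children.append(child)
    child.parent = weakref.ref(parent)    # call child.parent() to get the parent or None


def unreachable_after(build):
    """Return how many objects only the cycle collector could reclaim."""
    gc.collect()          # start from a clean slate
    build()               # the objects become garbage when build() returns
    return gc.collect()   # number of unreachable objects found


strong_count = unreachable_after(build_with_strong_backref)
weak_count = unreachable_after(build_with_weak_backref)
print("strong back-reference:", strong_count)
print("weak back-reference:  ", weak_count)
```

With the strong back-reference the two nodes keep each other alive until the collector runs; with the weak one, reference counting frees them as soon as the function returns, so the collector has nothing left to find.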
