Is it possible to have an actual memory leak in Python because of your code?

Question

I don't have a code example, but I'm curious whether it's possible to write Python code that results in essentially a memory leak.

Answer

It is possible, yes.

It depends on what kind of memory leak you are talking about. Within pure Python code, it's not possible to "forget to free" memory the way you can in C, but it is possible to leave a reference hanging somewhere. Some examples:

An unhandled traceback object that keeps an entire stack frame alive, even though the function is no longer running

import sys

while game.running():
    try:
        key_press = handle_input()
    except SomeException:
        # Binding the traceback to a local keeps its stack frames alive
        etype, evalue, tb = sys.exc_info()
        # Do something with tb like inspecting or printing the traceback

In this silly example of a game loop, we assigned tb to a local variable. We had good intentions, but tb contains frame information for the whole stack, from handle_input all the way down to whatever it called. Presuming your game continues, tb is kept alive even during your next call to handle_input, and maybe forever. The docs for exc_info talk about this potential circular reference issue and recommend simply not assigning tb if you don't absolutely need it. If you only need the traceback text, consider e.g. traceback.format_exc.
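As a minimal sketch of the traceback.format_exc suggestion: the handler below returns the traceback as a plain string, so no traceback or frame objects are kept once the except block ends. The handle_input stub and the poll_once helper are hypothetical stand-ins, not part of the original example.

```python
import traceback


def handle_input():
    # Hypothetical stand-in for the game's input handler; it always fails here.
    raise ValueError("bad key")


def poll_once():
    """Run one iteration and return the formatted traceback text, if any."""
    try:
        handle_input()
    except ValueError:
        # format_exc() returns a str, so no frame objects are retained.
        return traceback.format_exc()
    return None


report = poll_once()
```

The string can be logged or printed later without pinning the locals of every frame that was on the stack when the exception was raised.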

Storing values in class or global scope instead of instance scope, and not realizing it.

This one can happen in insidious ways, but often happens when you define mutable types in your class scope.

class Money(object):
    name = ''
    symbols = []   # This is the dangerous line here

    def set_name(self, name):
        self.name = name

    def add_symbol(self, symbol):
        self.symbols.append(symbol)

In the above example, say you did:

m = Money()
m.set_name('Dollar')
m.add_symbol('$')

You'll probably find this particular bug quickly, but in this case you put a mutable value at class scope, and even though you correctly access it through the instance, the lookup is actually "falling through" to the class object's __dict__. Every instance therefore appends to the same list.

Used in certain contexts, such as holding objects, this pattern could cause your application's heap to grow forever, which would be a problem in, say, a production web application that doesn't restart its processes occasionally.
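The difference can be sketched as follows. SharedMoney is a hypothetical name for the class-scope variant; Money here moves the list into __init__ so each instance owns its own.

```python
class SharedMoney:
    symbols = []  # one list, stored on the class and shared by all instances

    def add_symbol(self, symbol):
        self.symbols.append(symbol)


class Money:
    def __init__(self):
        self.symbols = []  # a fresh list for every instance

    def add_symbol(self, symbol):
        self.symbols.append(symbol)


a, b = SharedMoney(), SharedMoney()
a.add_symbol('$')
shared_leak = b.symbols  # b sees the symbol that was added through a

c, d = Money(), Money()
c.add_symbol('$')
independent = d.symbols  # d is unaffected by c
```

With the class-scope list, the data outlives every instance because the class object itself holds the reference.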

Cyclic references in classes which also have a __del__ method.

Ironically, in Python versions before 3.4 the existence of a __del__ method makes it impossible for the cyclic garbage collector to clean an instance up. (Since Python 3.4 and PEP 442 such cycles can be collected, although the order in which finalizers run within a cycle is still not guaranteed.) Say you had something where you wanted a destructor for finalization purposes:

class ClientConnection(...):
    def __del__(self):
        if self.socket is not None:
            self.socket.close()
            self.socket = None

Now this works fine on its own, and you may be led to believe it's being a good steward of OS resources by ensuring the socket is "disposed" of.

However, if ClientConnection kept a reference to, say, a User, and User kept a reference to the connection, you might be tempted to have the user drop its reference to the connection on cleanup. This is actually the flaw: the cyclic GC doesn't know the correct order in which to run the finalizers, and on Python versions before 3.4 it cannot clean the cycle up.

The solution is to make sure you do the cleanup explicitly, say on disconnect events, by calling some sort of close method, and to name that method something other than __del__.
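One way this could look is sketched below, under the assumption that the connection owns a socket-like object and a back-reference to a user. FakeSocket is a hypothetical stand-in so the sketch runs without a network, and the context-manager methods are an optional convenience, not something the original answer prescribes.

```python
class FakeSocket:
    """Hypothetical stand-in for a real socket."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class ClientConnection:
    def __init__(self, sock):
        self.socket = sock
        self.user = None

    def close(self):
        # Explicit, idempotent cleanup instead of relying on __del__.
        if self.socket is not None:
            self.socket.close()
            self.socket = None
        # Break the connection <-> user cycle so plain reference
        # counting can free both objects.
        if self.user is not None:
            self.user.connection = None
            self.user = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, exc_tb):
        self.close()
        return False  # do not swallow exceptions


class User:
    def __init__(self, connection):
        self.connection = connection
        connection.user = self


sock = FakeSocket()
with ClientConnection(sock) as conn:
    user = User(conn)
```

Because close is an ordinary method, it runs at a moment you choose, and calling it twice is harmless.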

Poorly implemented C extensions, or C libraries not used as intended.

In Python, you trust the garbage collector to throw away things you aren't using. But if you use a C extension that wraps a C library, the majority of the time you are responsible for making sure you explicitly close or de-allocate resources. Mostly this is documented, but a Python programmer who is used to not having to do this explicit de-allocation might throw away the handle to that library (by returning from a function, for example) without knowing that resources are still being held.
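If you write such a wrapper yourself, one possible safety net is weakref.finalize from the standard library, combined with an explicit close. The sketch below is only illustrative: NativeBuffer, the integer handle, and the released_handles list are hypothetical, and a real wrapper would call the library's own release function where the comment indicates.

```python
import weakref

# Records released handles, only so the sketch is observable.
released_handles = []


def _release(handle):
    # A real wrapper would call the C library's free/close function here.
    released_handles.append(handle)


class NativeBuffer:
    """Hypothetical wrapper around a handle owned by a C library."""

    def __init__(self, handle):
        self.handle = handle
        # The callback receives the handle, not self; a callback that
        # referenced self would keep the object alive forever.
        self._finalizer = weakref.finalize(self, _release, handle)

    def close(self):
        # A finalizer runs at most once, however often it is called.
        self._finalizer()


buf = NativeBuffer(42)
buf.close()
buf.close()  # second call is a no-op
```

The explicit close remains the primary path; the finalizer only covers the case where a caller drops the wrapper without closing it.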

Lexical scopes with closures that capture a whole lot more than you anticipated

class User:
    def set_profile(self, profile):
        def on_completed(result):
            if result.success:
                self.profile = profile

        self._db.execute(
            change={'profile': profile},
            on_complete=on_completed
        )

In this contrived example, we appear to be using some sort of "async" call that will call us back at on_completed when the DB call is done (the implementation could have used promises; it ends up with the same outcome).

What you may not realize is that the on_completed closure binds a reference to self in order to execute the self.profile assignment. Now, perhaps the DB client keeps track of active queries and pointers to the closures to call when they're done (since it's async), and say it crashes for whatever reason. If the DB client doesn't correctly clean up its callbacks, the DB client now has a reference to on_completed, which has a reference to User, which holds _db: you've now created a circular reference that may never get collected.

(Even without a circular reference, the fact that closures bind locals, and sometimes even instances, may cause values you thought were collected to live for a long time. These could include sockets, clients, large buffers, and entire trees of things.)
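One possible mitigation is to have the closure capture a weak reference instead of self. In the sketch below, FakeDb is a hypothetical client that never clears its callbacks, and the callback takes a plain boolean rather than a result object; both are assumptions made to keep the example runnable.

```python
import gc
import weakref


class FakeDb:
    """Hypothetical client that stores callbacks and never clears them."""

    def __init__(self):
        self.pending = []

    def execute(self, change, on_complete):
        self.pending.append(on_complete)


class User:
    def __init__(self, db):
        self._db = db
        self.profile = None

    def set_profile(self, profile):
        self_ref = weakref.ref(self)  # the closure captures this, not self

        def on_completed(success):
            user = self_ref()  # None once the User has been collected
            if user is not None and success:
                user.profile = profile

        self._db.execute(change={'profile': profile}, on_complete=on_completed)


db = FakeDb()
user = User(db)
user.set_profile('admin')
probe = weakref.ref(user)
del user
gc.collect()
user_was_freed = probe() is None
```

The stale callback still sits in db.pending, but it no longer pins the User; invoking it later simply does nothing.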

Default parameters which are mutable types

import time

def foo(a=[]):
    # The default list is created once, when the function is defined
    a.append(time.time())
    return a

This is a contrived example, but one could be led to believe that a default value of [] means each call starts with a new empty list, when in fact every call that omits a appends to the same list. This again might cause unbounded growth without you knowing that you did that.
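The usual way around this is a None sentinel, sketched here next to the shared-default version for comparison. The name foo_shared is hypothetical and only distinguishes the two variants.

```python
import time


def foo_shared(a=[]):
    # The same default list is reused by every call that omits a.
    a.append(time.time())
    return a


def foo(a=None):
    # Create a new list on each call when the caller passes nothing.
    if a is None:
        a = []
    a.append(time.time())
    return a


foo_shared()
grown = len(foo_shared())  # the shared default now holds two timestamps

foo()
fresh = len(foo())  # each call started from its own empty list
```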
