Memory leaks in Python when using an external C DLL


Problem description

I have a Python module that calls a DLL written in C to encode XML strings. Once the function returns the encoded string, the memory allocated during this step is never deallocated. Concretely:

encodeMyString = ctypes.create_string_buffer(4096)

CallEncodingFuncInDLL(encodeMyString, InputXML)

I have looked at this, this, and this, and have also tried calling gc.collect(). But perhaps, since the object was allocated in an external DLL, the Python GC has no record of it and cannot reclaim it. Because the code keeps calling the encoding function, it keeps allocating memory, and eventually the Python process crashes. Is there a way to profile this memory usage?

Recommended answer

Since you haven't given any information about the DLL, this will necessarily be pretty vague, but…

Python can't track memory allocated by something external that it doesn't know about. How could it? That memory could be part of the DLL's constant segment, or allocated with mmap or VirtualAlloc, or part of a larger object, or the DLL could just be expecting it to be alive for its own use.
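To make that concrete, here is a minimal sketch showing that Python's own allocation tracker does not see memory obtained directly from the C allocator. It assumes a Unix-like system where `ctypes.util.find_library("c")` locates a C library that exports `malloc` and `free`; on Windows the runtime library and the way to load it differ.

```python
import ctypes
import ctypes.util
import tracemalloc

# Assumption: a Unix-like system whose C library exports malloc and free.
libc = ctypes.CDLL(ctypes.util.find_library("c"))
libc.malloc.argtypes = [ctypes.c_size_t]
libc.malloc.restype = ctypes.c_void_p
libc.free.argtypes = [ctypes.c_void_p]
libc.free.restype = None

tracemalloc.start()
before, _ = tracemalloc.get_traced_memory()

# 10 MiB allocated by C code, outside Python's allocator.
block = libc.malloc(10 * 1024 * 1024)

after, _ = tracemalloc.get_traced_memory()
tracemalloc.stop()
libc.free(block)

# Only the tiny Python objects created around the call are counted.
traced_growth = after - before
print(f"tracemalloc saw {traced_growth} bytes of growth")
```

The traced growth stays far below the 10 MiB that was actually allocated, which is why tools that watch Python objects (including `gc`) cannot help with a leak inside the DLL; process-level tools are needed to observe it.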

Any DLL that has a function that allocates and returns a new object has to have a function that deallocates that object. For example, if CallEncodingFuncInDLL returns a new object that you're responsible for, there will be a function like DestroyEncodedThingInDLL that takes such an object and deallocates it.

So, when do you call this function?

Let's step back and make this more concrete. Let's say the function is plain old strdup, so the function you call to free up the memory is free. You have two choices for when to call free. No, I have no idea why you'd ever want to call strdup from Python, but it's about the simplest possible example, so let's pretend it's not useless.

The first option is to call strdup, immediately convert the returned value to a native Python object and free it, and not have to worry about it after that:

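# assumes libc is an already-loaded ctypes CDLL for the C runtime
libc.strdup.restype = ctypes.c_void_p    # keep the raw pointer; the default int can truncate it
libc.free.argtypes = [ctypes.c_void_p]
newbuf = libc.strdup(mybuf)
s = ctypes.string_at(newbuf)             # copy the C string into a Python bytes object
libc.free(newbuf)
# now use s, which is just a Python bytes object, so it's GC-able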
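For reference, a self-contained version of this first option might look like the sketch below. It assumes a Unix-like system where the C library can be found with `ctypes.util.find_library("c")`; the helper name `dup_to_bytes` is hypothetical and only serves the illustration.

```python
import ctypes
import ctypes.util

# Assumption: a Unix-like system where the C library exports strdup and free.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare signatures explicitly; the default int return type can truncate
# a 64-bit pointer.
libc.strdup.argtypes = [ctypes.c_char_p]
libc.strdup.restype = ctypes.c_void_p   # raw address, no automatic conversion
libc.free.argtypes = [ctypes.c_void_p]
libc.free.restype = None


def dup_to_bytes(data: bytes) -> bytes:
    """Hypothetical helper: strdup the input, copy it into Python, free the C copy."""
    ptr = libc.strdup(data)
    if ptr is None:  # c_void_p maps a NULL return to None
        raise MemoryError("strdup failed")
    try:
        # string_at copies the bytes up to the terminating NUL.
        return ctypes.string_at(ptr)
    finally:
        libc.free(ptr)


s = dup_to_bytes(b"<xml>hello</xml>")
print(s)
```

Because the C buffer is freed before the helper returns, the caller only ever holds an ordinary `bytes` object.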

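Or, better, wrap this up so it's automatic. A plain callable assigned to restype is handed the result as a C int, which can truncate a 64-bit pointer, so this version keeps restype as c_void_p and does the conversion in an errcheck hook:

def convert_and_free_char_p(result, func, args):
    # result is the raw address returned by strdup (None if it returned NULL)
    if result is None:
        raise MemoryError("strdup returned NULL")
    try:
        return ctypes.string_at(result)
    finally:
        libc.free(result)

libc.strdup.restype = ctypes.c_void_p
libc.strdup.errcheck = convert_and_free_char_p
libc.free.argtypes = [ctypes.c_void_p]

s = libc.strdup(mybuf)
# now use s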


But some objects can't be converted to a native Python object so easily. Or they can be, but it's not very useful to do so, because you need to keep passing them back into the DLL. In that case, you can't clean it up until you're done with it.

The best way to do this is to wrap that opaque value up in a class that releases it on close or __exit__ or __del__ or whatever seems appropriate. One nice way to do this is with @contextmanager:

import contextlib

@contextlib.contextmanager
def freeing(value):
    try:
        yield value
    finally:
        libc.free(value)

So:

newbuf = libc.strdup(mybuf)
with freeing(newbuf):
    do_stuff(newbuf)
    do_more_stuff(newbuf)
# automatically freed before you get here
# (or even if you don't, because of an exception/return/etc.)

Or:

@contextlib.contextmanager
def strduping(buf):
    value = libc.strdup(buf)
    try:
        yield value
    finally:
        libc.free(value)

Now:

with strduping(mybuf) as newbuf:
    do_stuff(newbuf)
    do_more_stuff(newbuf)
# again, automatically freed here
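The class-based variant mentioned earlier (releasing on close, __exit__, or __del__) could be sketched roughly as follows. This is only an illustration under the same strdup/free stand-in: the class name `CString` is hypothetical, and the loading code assumes a Unix-like system.

```python
import ctypes
import ctypes.util

# Assumption: a Unix-like system where the C library exports strdup and free.
libc = ctypes.CDLL(ctypes.util.find_library("c"))
libc.strdup.argtypes = [ctypes.c_char_p]
libc.strdup.restype = ctypes.c_void_p
libc.free.argtypes = [ctypes.c_void_p]
libc.free.restype = None


class CString:
    """Hypothetical wrapper that owns exactly one strdup'd buffer."""

    _ptr = None  # class-level default so close() is safe even if __init__ fails

    def __init__(self, data: bytes):
        self._ptr = libc.strdup(data)
        if self._ptr is None:
            raise MemoryError("strdup failed")

    @property
    def value(self) -> bytes:
        if self._ptr is None:
            raise ValueError("buffer already freed")
        return ctypes.string_at(self._ptr)

    def close(self):
        # Idempotent: freeing twice would be a double free in C.
        if self._ptr is not None:
            libc.free(self._ptr)
            self._ptr = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()

    def __del__(self):
        # Last-resort cleanup; explicit close() or a with block is preferable.
        self.close()


with CString(b"encoded") as cs:
    print(cs.value)
# the C buffer has been freed here
```

Clearing the pointer after freeing makes `close` safe to call from all three paths, so an object that is closed explicitly and later garbage-collected does not free the same memory twice.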
