Python C API: Assigning PyObjects to a dictionary causes memory leak


Question

I am writing a C++ wrapper for Python using the Python C API. In my case I have to make large amounts of byte-oriented data accessible to the Python script. For this purpose I use PyByteArray_FromStringAndSize to produce a Python bytearray (https://docs.python.org/2.7/c-api/bytearray.html).

Returning this bytearray directly causes no problems. However, when I add the bytearray to a Python dict, the bytearray's memory is not released once the dict is destroyed.

This can be solved by calling Py_DECREF on the bytearray object after adding it to the Python dict.

Below is a complete working example of my code. It contains a method dummyArrPlain that returns the plain bytearray and a method dummyArrInDict that returns the bytearray in a dict. The second method leaks memory unless Py_DECREF(pyData); is called.

My question is: why is Py_DECREF necessary at this point? Intuitively, I would have expected Py_DECREF to be called once the dict is destroyed.

I also assign values to a dict like this:

PyDict_SetItem(dict, PyString_FromString("i"), PyInt_FromLong(i));

Will this also leak memory if I do not call Py_DECREF on the created string and integer objects?

This is my dummy C++ wrapper:

#include <python2.7/Python.h>

static char module_docstring[] = "This is a module causing a memory leak";

static PyObject *dummyArrPlain(PyObject *self, PyObject *args);
static PyObject *dummyArrInDict(PyObject *self, PyObject *args);

static PyMethodDef module_methods[] = {
    {"dummy_arr_plain", dummyArrPlain, METH_VARARGS, "returns a plain dummy bytearray"},
    {"dummy_arr_in_dict", dummyArrInDict, METH_VARARGS, "returns a dummy bytearray in a dict"},
    {NULL, NULL, 0, NULL}
};

PyMODINIT_FUNC initlibdummy(void)
{
    PyObject *m = Py_InitModule("libdummy", module_methods);
    if (m == NULL)
        return;
}


static PyObject *dummyArrPlain(PyObject *self, PyObject *args)
{
    int len = 10000000;
    char* data = new char[len];
    for(int i=0; i<len; i++) {
        data[i] = 0;
    }

    PyObject * pyData = PyByteArray_FromStringAndSize(data, len);
    delete [] data;

    return pyData;
}


static PyObject *dummyArrInDict(PyObject *self, PyObject *args)
{
    int len = 10000000;
    char* data = new char[len];
    for(int i=0; i<len; i++) {
        data[i] = 0;
    }
    PyObject * pyData = PyByteArray_FromStringAndSize(data, len);
    delete [] data;

    PyObject *dict = PyDict_New();
    PyDict_SetItem(dict, PyString_FromString("data"), pyData);

    // memory leak without Py_DECREF(pyData);

    return dict;
}

And a dummy Python script using the wrapper:

import libdummy
import time

while True:
    a = libdummy.dummy_arr_in_dict()
    time.sleep(0.01)

Recommended answer

This comes down to reference ownership rules (see [Python 2.0.Docs]: Ownership rules). I'll use Python 2.7.10 for the examples (pretty old, but I don't think the behavior has changed significantly since).

PyByteArray_FromStringAndSize (bytearrayobject.c:168) creates a new object (using PyObject_New) and also allocates memory for the buffer.

By default, the refcount of that object (or rather, of any newly created object) is 1 (set by _Py_NewReference). When the user calls del on it, or at program exit, the refcount is decremented, and when it reaches 0 the object is deallocated.

  • This is what happens on the code path where the object is returned directly (dummyArrPlain)
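The single-owner lifecycle described above can be observed from plain Python. This is a minimal sketch, not the wrapper's code: Payload is a hypothetical stand-in class (used because bytearray cannot be weakly referenced), and the immediate deallocation relies on CPython's reference counting.

```python
import weakref


class Payload(object):
    """Hypothetical stand-in for the bytearray; a plain class supports weak references."""


def single_owner_lifecycle():
    obj = Payload()                 # exactly one strong reference: the name `obj`
    probe = weakref.ref(obj)        # a weak reference does not add to the refcount
    alive_before = probe() is not None
    del obj                         # drop the only reference; CPython frees the object at once
    alive_after = probe() is not None
    return alive_before, alive_after


print(single_owner_lifecycle())
```

On CPython the object is alive before the del and gone right after it, which mirrors what happens to the bytearray returned by dummyArrPlain once the script drops it.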

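But in dummyArrInDict's case, PyDict_SetItem (indirectly) does a Py_INCREF on pyData (it does other things too, but only this is relevant here), so the refcount ends up at 2. When the dict is destroyed it releases only its own reference, leaving the refcount at 1, so the object is never deallocated: that is the memory leak. PyDict_SetItem does not take over the caller's references to either the key or the value, so the temporaries created by PyString_FromString and PyInt_FromLong in the question are leaked references as well, unless they are released with Py_DECREF after the call.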
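The dict's extra reference can be made visible from Python with sys.getrefcount. This is only an illustrative sketch running on CPython, not the C code from the question: the local name payload plays the role of the reference that the C function owns after PyByteArray_FromStringAndSize. Because sys.getrefcount itself adds a temporary reference, only the differences between readings are meaningful.

```python
import sys


def refcount_trace():
    payload = bytearray(16)              # the name `payload` is the creator's own reference
    base = sys.getrefcount(payload)      # baseline reading (includes getrefcount's temporary)
    holder = {"data": payload}           # storing the value makes the dict take its own reference
    in_dict = sys.getrefcount(payload)
    del holder                           # destroying the dict releases only the dict's reference
    after_dict = sys.getrefcount(payload)
    # Report the changes relative to the baseline.
    return in_dict - base, after_dict - base


print(refcount_trace())
```

The count goes up by one while the dict exists and returns to the baseline afterwards. The baseline reference is the one Python releases automatically when the name goes out of scope; in the C function nothing does that for pyData, which is why the explicit Py_DECREF after PyDict_SetItem is needed.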

It's basically the same thing you're doing with data: you allocate memory for it, and when you no longer need it you free it (because you are not returning it, only using it temporarily).

Note: It's safer to use the X macros (e.g. [Python 2.Docs]: Py_XDECREF), especially since you are not checking the returned PyObjects for NULL.

For more details, also take a look at [Python 2.Docs]: C API Reference.
