Size of list in memory

Question

I've been experimenting with the size of Python data structures in memory. I wrote the following snippet:

import sys
lst1=[]
lst1.append(1)
lst2=[1]
print(sys.getsizeof(lst1), sys.getsizeof(lst2))

I tested the code on the following configurations:

  • Windows 7 64-bit, Python 3.1: the output is 52 40, so lst1 takes 52 bytes and lst2 takes 40 bytes.
  • Ubuntu 11.04 32-bit, Python 3.2: the output is 48 32
  • Ubuntu 11.04 32-bit, Python 2.7: the output is 48 36

Can anyone explain to me why the two sizes differ although both are lists containing a 1?

In the Python documentation for the getsizeof function I found the following: "...adds an additional garbage collector overhead if the object is managed by the garbage collector." Could this be what is happening in my little example?

Solution

Here's a fuller interactive session that will help me explain what's going on (Python 2.6 on 32-bit Windows XP, but that doesn't really matter):

>>> import sys
>>> sys.getsizeof([])
36
>>> sys.getsizeof([1])
40
>>> lst = []
>>> lst.append(1)
>>> sys.getsizeof(lst)
52
>>> 

Note that the empty list is a bit smaller than the list [1]. When an element is appended to an empty list, however, the result is considerably larger than either.
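The same comparison can be made independent of the platform by expressing each reading as a number of pointer-sized slots. The sketch below is an illustration, not part of the original session. It assumes CPython, where the size reported for a list grows by one pointer per allocated slot, and the helper name slots is made up for this example.

```python
import struct
import sys

POINTER_SIZE = struct.calcsize("P")  # 4 on 32-bit builds, 8 on 64-bit builds
EMPTY_SIZE = sys.getsizeof([])       # size of the bare list structure


def slots(lst):
    """Hypothetical helper: pointer slots allocated beyond the bare list."""
    return (sys.getsizeof(lst) - EMPTY_SIZE) // POINTER_SIZE


literal = [1]       # built directly with one element
appended = []
appended.append(1)  # built by appending to an empty list

print("literal  [1]:", sys.getsizeof(literal), "bytes,", slots(literal), "slot(s)")
print("appended [1]:", sys.getsizeof(appended), "bytes,", slots(appended), "slot(s)")
```

Both lists compare equal, yet the appended one should report more slots than it has elements.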

The reason lies in implementation details in Objects/listobject.c, in the CPython source.

Empty list

When an empty list [] is created, no space for elements is allocated - this can be seen in PyList_New. 36 bytes is the amount of space required for the list data structure itself on a 32-bit machine.

List with one element

When a list with a single element [1] is created, space for one element is allocated in addition to the memory required by the list data structure itself. Again, this can be found in PyList_New. Given size as an argument, it computes:

nbytes = size * sizeof(PyObject *);

It then does:

if (size <= 0)
    op->ob_item = NULL;
else {
    op->ob_item = (PyObject **) PyMem_MALLOC(nbytes);
    if (op->ob_item == NULL) {
        Py_DECREF(op);
        return PyErr_NoMemory();
    }
    memset(op->ob_item, 0, nbytes);
}
Py_SIZE(op) = size;
op->allocated = size;

So with size = 1, space for one pointer is allocated: 4 bytes on my 32-bit box.
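Under that reading, a list created with exactly n slots should occupy the bare list structure plus n pointers. Here is a minimal check of that idea. It assumes CPython, and it assumes that [None] * n allocates exactly n slots, which is an implementation detail rather than a guarantee. The name predicted_size is invented for this sketch.

```python
import struct
import sys

POINTER_SIZE = struct.calcsize("P")  # sizeof(PyObject *) on this build
EMPTY_SIZE = sys.getsizeof([])       # bare list structure, no element storage


def predicted_size(n):
    """Mirror nbytes = size * sizeof(PyObject *), plus the bare list."""
    return EMPTY_SIZE + n * POINTER_SIZE


for n in range(5):
    exact = [None] * n  # assumed to allocate exactly n slots
    print(n, predicted_size(n), sys.getsizeof(exact))
```

If the assumption holds, the last two columns agree on every row.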

Appending to an empty list

When calling append on an empty list, here's what happens:

  • PyList_Append calls app1
  • app1 asks for the list's size (and gets 0 as an answer)
  • app1 then calls list_resize with size+1 (1 in our case)
  • list_resize has an interesting allocation strategy, summarized in this comment from its source.

Here it is:

/* This over-allocates proportional to the list size, making room
 * for additional growth.  The over-allocation is mild, but is
 * enough to give linear-time amortized behavior over a long
 * sequence of appends() in the presence of a poorly-performing
 * system realloc().
 * The growth pattern is:  0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
 */
new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6);

/* check for integer overflow */
if (new_allocated > PY_SIZE_MAX - newsize) {
    PyErr_NoMemory();
    return -1;
} else {
    new_allocated += newsize;
}
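To see where the growth pattern in that comment comes from, the formula can be ported to Python and driven by a run of simulated appends. This is a simplified sketch: it models growth only and ignores the overflow check, and the function names are made up here. It ports the formula exactly as quoted, so other CPython versions may produce a different sequence.

```python
def over_allocate(newsize):
    """Python port of the quoted formula: newsize plus a mild surplus."""
    new_allocated = (newsize >> 3) + (3 if newsize < 9 else 6)
    return new_allocated + newsize


def growth_pattern(n_appends):
    """Capacities seen while appending n_appends items to an empty list."""
    allocated = 0
    pattern = [allocated]
    for size in range(n_appends):
        newsize = size + 1
        if newsize > allocated:  # out of room: resize with over-allocation
            allocated = over_allocate(newsize)
            pattern.append(allocated)
    return pattern


print(growth_pattern(73))
# → [0, 4, 8, 16, 25, 35, 46, 58, 72, 88]
```

Only 9 of the 73 simulated appends trigger a resize, which is the amortization the comment describes.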

Let's do some math

Let's see how the numbers in the session at the beginning are reached.

So 36 bytes is the size required by the list data structure itself on a 32-bit machine. With a single element, space is allocated for one pointer, so that's 4 extra bytes, for a total of 40 bytes. OK so far.

When app1 is called on an empty list, it calls list_resize with newsize = 1. Plugging that into the over-allocation formula gives (1 >> 3) + 3 = 3 extra slots, and 3 + 1 = 4, so room for 4 pointers is allocated. 4 * 4 = 16 bytes, and 36 + 16 = 52.
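The arithmetic can be written out as a short script. The constants are the 32-bit figures from the session above, hard-coded for illustration rather than measured, and list_bytes is a name invented for this sketch.

```python
HEADER_BYTES = 36   # bare list structure on the 32-bit build in the session
POINTER_BYTES = 4   # sizeof(PyObject *) on that build


def list_bytes(allocated_slots):
    """Total size: bare list structure plus one pointer per allocated slot."""
    return HEADER_BYTES + allocated_slots * POINTER_BYTES


# list_resize is asked for newsize = 1 and applies the quoted formula
newsize = 1
allocated = newsize + (newsize >> 3) + (3 if newsize < 9 else 6)

print(list_bytes(0))          # []                     → 36
print(list_bytes(1))          # [1]                    → 40
print(list_bytes(allocated))  # [] then append(1)      → 52
```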

Indeed, everything makes sense :-)
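
As for the garbage-collector overhead the question asks about: sys.getsizeof does add it for lists, but it should be the same fixed amount for both lists, so it cannot account for the difference. A quick check, assuming CPython; gc_overhead is a helper invented for this sketch.

```python
import sys

literal = [1]
appended = []
appended.append(1)


def gc_overhead(obj):
    """Bytes sys.getsizeof adds on top of the object's own __sizeof__()."""
    return sys.getsizeof(obj) - obj.__sizeof__()


print(gc_overhead(literal), gc_overhead(appended))
```

The two printed values should match, which leaves the allocated slots as the only source of the gap.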
