Size of list in memory
Problem description
I just experimented with the size of Python data structures in memory. I wrote the following snippet:
import sys
lst1=[]
lst1.append(1)
lst2=[1]
print(sys.getsizeof(lst1), sys.getsizeof(lst2))
I tested the code on the following configurations:
- Windows 7 64-bit, Python 3.1: the output is:
52 40
so lst1 has 52 bytes and lst2 has 40 bytes.
- Ubuntu 11.4 32-bit with Python 3.2: the output is:
48 32
- Ubuntu 11.4 32-bit with Python 2.7: the output is:
48 36
Can anyone explain to me why the two sizes differ although both are lists containing a 1?
In the Python documentation for the getsizeof function I found the following: "...adds an additional garbage collector overhead if the object is managed by the garbage collector."
Could this be the case in my little example?
Here's a fuller interactive session that will help me explain what's going on (Python 2.6 on Windows XP 32-bit, but it doesn't matter really):
>>> import sys
>>> sys.getsizeof([])
36
>>> sys.getsizeof([1])
40
>>> lst = []
>>> lst.append(1)
>>> sys.getsizeof(lst)
52
>>>
Note that the empty list is a bit smaller than the one with [1] in it. When an element is appended, however, it grows much larger.
The reason for this lies in the implementation details in Objects/listobject.c, in the source of CPython.
Empty list
When an empty list [] is created, no space for elements is allocated - this can be seen in PyList_New. 36 bytes is the amount of space required for the list data structure itself on a 32-bit machine.
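On the garbage-collector question: the documentation says sys.getsizeof calls the object's __sizeof__ method and adds the collector overhead for GC-managed objects, so that overhead can be measured as the difference between the two. A minimal sketch, assuming CPython (the variable names are mine, and the exact numbers depend on version and platform):

```python
import sys

empty = []
literal = [1]

# __sizeof__ reports the object without the garbage-collector header;
# sys.getsizeof adds that header for GC-managed objects such as lists.
gc_overhead_empty = sys.getsizeof(empty) - empty.__sizeof__()
gc_overhead_literal = sys.getsizeof(literal) - literal.__sizeof__()

print("GC overhead for []: ", gc_overhead_empty)
print("GC overhead for [1]:", gc_overhead_literal)
```

Both lists should report the same overhead, so the collector header is part of each total but cannot account for the gap between them.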
List with one element
When a list with a single element [1] is created, space for one element is allocated in addition to the memory required by the list data structure itself. Again, this can be found in PyList_New. Given size as argument, it computes:
nbytes = size * sizeof(PyObject *);
And then has:
if (size <= 0)
    op->ob_item = NULL;
else {
    op->ob_item = (PyObject **) PyMem_MALLOC(nbytes);
    if (op->ob_item == NULL) {
        Py_DECREF(op);
        return PyErr_NoMemory();
    }
    memset(op->ob_item, 0, nbytes);
}
Py_SIZE(op) = size;
op->allocated = size;
So we see that with size = 1, space for one pointer is allocated: 4 bytes (on my 32-bit box).
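The one-pointer difference can be checked on whatever machine is at hand. A minimal sketch, assuming CPython and using struct.calcsize("P") to get the size of a C pointer (the variable names are illustrative):

```python
import struct
import sys

# "P" is the struct format code for a C pointer (void *).
pointer_size = struct.calcsize("P")

empty_size = sys.getsizeof([])
one_element_size = sys.getsizeof([1])
extra = one_element_size - empty_size

print("pointer size:       ", pointer_size)
print("extra bytes for [1]:", extra)
```

On a 32-bit build like the one in the session above, the two printed values should both be 4; on a 64-bit build the pointer is typically 8 bytes.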
Appending to an empty list
When calling append on an empty list, here's what happens:
- PyList_Append calls app1
- app1 asks for the list's size (and gets 0 as an answer)
- app1 then calls list_resize with size+1 (1 in our case)
- list_resize has an interesting allocation strategy, summarized in this comment from its source.
Here it is:
/* This over-allocates proportional to the list size, making room
 * for additional growth. The over-allocation is mild, but is
 * enough to give linear-time amortized behavior over a long
 * sequence of appends() in the presence of a poorly-performing
 * system realloc().
 * The growth pattern is: 0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
 */
new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6);

/* check for integer overflow */
if (new_allocated > PY_SIZE_MAX - newsize) {
    PyErr_NoMemory();
    return -1;
} else {
    new_allocated += newsize;
}
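The growth pattern in the comment can be reproduced by replaying the formula over a run of appends. This is a rough Python model that assumes only the arithmetic quoted above: it ignores the overflow check and the rest of list_resize, new_allocated_for and growth_pattern are hypothetical helper names, and other CPython versions may use a different formula.

```python
def new_allocated_for(newsize):
    """Mirror the quoted C arithmetic: over-allocation plus the requested size."""
    return (newsize >> 3) + (3 if newsize < 9 else 6) + newsize


def growth_pattern(n_appends):
    """Return the successive capacities seen while appending n_appends items."""
    allocated = 0
    pattern = [allocated]
    for size in range(1, n_appends + 1):
        # A resize is only needed when the new length exceeds the capacity.
        if size > allocated:
            allocated = new_allocated_for(size)
            pattern.append(allocated)
    return pattern


print(growth_pattern(88))  # [0, 4, 8, 16, 25, 35, 46, 58, 72, 88]
```

The first resize, for newsize = 1, yields a capacity of 4, which is the value that matters for the single append in the question.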
Let's do some math
Let's see how the numbers I quoted in the session at the beginning of my article are reached.
So 36 bytes is the size required by the list data structure itself on 32-bit. With a single element, space is allocated for one pointer, so that's 4 extra bytes - total 40 bytes. OK so far.
When app1 is called on an empty list, it calls list_resize with size=1. According to the over-allocation algorithm of list_resize, the formula gives (1 >> 3) + 3 + 1 = 4 for newsize = 1, so the next available size after 1 is 4 and space for 4 pointers will be allocated. 4 * 4 = 16 bytes, and 36 + 16 = 52.
Indeed, everything makes sense :-)
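The arithmetic can be written down as a tiny sketch. The 36-byte header and 4-byte pointer are the figures from the 32-bit session above, and predicted_list_size is a hypothetical helper, not a CPython function:

```python
def predicted_list_size(header_bytes, pointer_bytes, allocated_slots):
    """Fixed list header plus one pointer per allocated slot."""
    return header_bytes + pointer_bytes * allocated_slots


HEADER = 36   # sys.getsizeof([]) in the 32-bit session
POINTER = 4   # pointer size on that 32-bit build

literal = predicted_list_size(HEADER, POINTER, 1)    # [1]: exactly one slot
appended = predicted_list_size(HEADER, POINTER, 4)   # [] then append(1): four slots

print(literal, appended)  # 40 52
```

Plugging in the header and pointer sizes measured on another platform gives the corresponding prediction there, provided that build uses the same over-allocation formula.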