Why do tuples take less space in memory than lists?

Question

A tuple takes less memory in Python:

>>> a = (1,2,3)
>>> a.__sizeof__()
48

whereas a list takes more memory:

>>> b = [1,2,3]
>>> b.__sizeof__()
64

What happens internally in Python's memory management?

Recommended answer

I assume you're using a 64-bit build of CPython (I got the same results on my 64-bit CPython 2.7). The numbers may differ in other Python implementations or on a 32-bit Python.

Regardless of the implementation, lists are variable-sized while tuples are fixed-size.

So a tuple can store its elements directly inside the struct, whereas a list needs a layer of indirection: it stores a pointer to a separately allocated array of elements. That pointer is the extra cost; on a 64-bit system it is 64 bits, hence 8 bytes.
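The layout difference can be glimpsed from Python itself. The following is a minimal sketch, assuming CPython: `__itemsize__` reflects the C-level per-item size of a type, and the absolute struct sizes vary by version and platform, so none are hard-coded here.

```python
import struct

# Size of a C pointer on this build (8 on a 64-bit interpreter).
POINTER_SIZE = struct.calcsize("P")

# A tuple's element slots live inside the tuple object itself,
# so the type reports a per-item size.
print("tuple item size:", tuple.__itemsize__)

# A list object holds only a pointer to a separately allocated array
# of slots, so the type itself reports no per-item size.
print("list item size:", list.__itemsize__)

# Adding one element to a tuple grows the object by exactly one slot.
growth_per_element = (1, 2, 3, 4).__sizeof__() - (1, 2, 3).__sizeof__()
print("tuple growth per element:", growth_per_element)
```

On a 64-bit build the per-element growth should match the pointer size, which is consistent with the 8 bytes per slot described above.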

But lists do another thing: they over-allocate. Otherwise list.append would always be an O(n) operation; over-allocating makes it amortized O(1), which is much faster. The list then has to keep track of both the allocated size and the filled size (a tuple only needs to store one size, because its allocated and filled sizes are always identical). So each list has to store another "size", which on a 64-bit system is a 64-bit integer, again 8 bytes.

So a list needs at least 16 bytes more memory than a tuple. Why "at least"? Because of the over-allocation, which means the list allocates more space than it currently needs. How much it over-allocates depends on how you create the list and on its append/deletion history:

>>> l = [1,2,3]
>>> l.__sizeof__()
64
>>> l.append(4)  # triggers re-allocation (with over-allocation), because the original list is full
>>> l.__sizeof__()
96

>>> l = []
>>> l.__sizeof__()
40
>>> l.append(1)  # re-allocation with over-allocation
>>> l.__sizeof__()
72
>>> l.append(2)  # no re-alloc
>>> l.append(3)  # no re-alloc
>>> l.__sizeof__()
72
>>> l.append(4)  # still has room, so no re-allocation needed (yet)
>>> l.__sizeof__()
72
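To see the over-allocation pattern on your own interpreter, here is a small sketch that records each point at which the list's reported size changes while appending. The helper name `growth_points` is made up for this illustration, and the exact sizes and growth steps are CPython implementation details that differ between versions.

```python
import sys

def growth_points(n):
    """Return (length, size_in_bytes) pairs at which the list's size changed."""
    items = []
    last_size = sys.getsizeof(items)
    points = [(0, last_size)]
    for i in range(n):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last_size:  # a re-allocation happened on this append
            points.append((len(items), size))
            last_size = size
    return points

points = growth_points(64)
for length, size in points:
    print(f"len={length:3d}  size={size} bytes")
```

Far fewer lines are printed than appends performed, which is the point of over-allocating: most appends reuse space that was reserved earlier.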

Images

I decided to create some images to accompany the explanation above. Maybe these are helpful.

This is how it is (schematically) stored in memory in your example. I highlighted the differences with red (free-hand) circles:

That's actually just an approximation, because int objects are also Python objects and CPython even reuses small integers. A probably more accurate (although less readable) representation of the objects in memory would be:
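Both points can be checked with a short sketch, assuming CPython: the container's size only counts the references it holds, and small integers are shared objects. The variable names are illustrative only.

```python
import sys

small = (1, 2, 3)
large = ("x" * 1000, "y" * 1000, "z" * 1000)

# Both tuples hold three references, so the containers are the same size
# even though the referenced objects differ greatly in size.
print(sys.getsizeof(small), sys.getsizeof(large))

# CPython implementation detail: small integers are cached, so the list
# and the tuple below refer to the very same int object.
numbers_list = [1, 2, 3]
numbers_tuple = (1, 2, 3)
print(numbers_list[0] is numbers_tuple[0])
```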

Useful links:

  • tuple struct in CPython repository for Python 2.7
  • list struct in CPython repository for Python 2.7
  • int struct in CPython repository for Python 2.7

Note that __sizeof__ doesn't really return the "correct" size! It only accounts for the object's own storage. When you use sys.getsizeof, the result is different:

>>> import sys
>>> l = [1,2,3]
>>> t = (1, 2, 3)
>>> sys.getsizeof(l)
88
>>> sys.getsizeof(t)
72

There are 24 "extra" bytes. These are real: they are the garbage collector overhead, which the __sizeof__ method does not account for. You're generally not supposed to call magic methods directly; use the functions that know how to handle them, in this case sys.getsizeof (which adds the GC overhead to the value returned by __sizeof__).
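The overhead can be measured directly rather than taken on faith. This is a rough sketch, assuming CPython; `gc_overhead` is a hypothetical helper written for this example, and the number of bytes it reports depends on the interpreter version (the 24 bytes above were measured on the answer's setup).

```python
import sys

def gc_overhead(obj):
    """Bytes that sys.getsizeof adds on top of obj.__sizeof__()."""
    return sys.getsizeof(obj) - obj.__sizeof__()

list_overhead = gc_overhead([1, 2, 3])
tuple_overhead = gc_overhead((1, 2, 3))
# An int is not tracked by the garbage collector, so nothing is added.
int_overhead = gc_overhead(12345)

print("list :", list_overhead)
print("tuple:", tuple_overhead)
print("int  :", int_overhead)
```

The list and the tuple report the same overhead, so the difference between the two containers is the same whichever of the two measurements you compare.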
