Memory consumption of a list and set in Python
Question
>>> from sys import getsizeof
>>> a=[i for i in range(1000)]
>>> b={i for i in range(1000)}
>>> getsizeof(a)
9024
>>> getsizeof(b)
32992
My question is: why does a set consume so much more memory than a list? Lists are ordered, sets are not. Is it the internal structure of a set that consumes the memory? Or does a list contain pointers while a set does not? Or maybe sys.getsizeof is wrong here? I've seen questions about tuples, lists and dictionaries, but I could not find any comparison between lists and sets.
Recommended answer
I think it's because of the inherent difference between a list and a set or dict, i.e. the way in which the elements are stored.
A list is nothing but a collection of references to the original objects. If you create 1000 integers, 1000 integer objects are created, and the list contains only the references to these objects.
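A minimal sketch of that point, assuming CPython (the variable names are illustrative only): three lists of equal length report the same size no matter what they refer to, and the payload works out to one pointer per element.

```python
import struct
from sys import getsizeof

n = 1000
pointer_size = struct.calcsize("P")  # bytes per object reference on this build

# Three lists of the same length holding very different objects.
small_ints = [0] * n
big_strings = ["x" * 10_000] * n
nones = [None] * n

# getsizeof counts only the list's own storage, not the objects it refers to.
sizes = [getsizeof(lst) for lst in (small_ints, big_strings, nones)]
print(sizes)

# Subtracting the empty-list header leaves roughly one reference per element.
payload = getsizeof(nones) - getsizeof([])
print(payload // n, pointer_size)
```

The lists here are built with multiplication rather than a comprehension so that no growth over-allocation blurs the per-element figure.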
On the other hand, a set or dictionary is a hash table: it has to compute the hash value of each of these 1000 integers, and it allocates its buckets in blocks that grow with the number of elements, so some buckets always stay empty.
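As a rough illustration, assuming CPython (where each set slot holds a cached hash next to the reference), the per-element container cost can be compared directly. The exact numbers depend on the interpreter version and platform.

```python
from sys import getsizeof

n = 1000
as_list = list(range(n))
as_set = set(range(n))

# Container bytes divided by element count; the int objects themselves
# are not included in either figure.
per_item_list = getsizeof(as_list) / n
per_item_set = getsizeof(as_set) / n

print(f"list: {per_item_list:.1f} bytes per element")
print(f"set:  {per_item_set:.1f} bytes per element")

# Every element of a set must be hashable; the hash decides its bucket.
print(hash(12345), hash((1, 2, 3)))
```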
For example, in both set and dict the smallest size is 8 by default (that is, if you are storing only 3 values, Python will still allocate 8 elements). On resize, the number of buckets increases by 4x until we reach 50,000 elements, after which the size increases by 2x. This gives the following possible sizes:
8, 32, 128, 512, 2048, 8192, 32768, 131072, 262144, ...
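The stepwise growth can also be observed empirically. This is only a sketch: the thresholds and growth factors it prints are an implementation detail of whichever CPython version runs it, so they may not match the figures quoted here.

```python
from sys import getsizeof

s = set()
last = getsizeof(s)
jumps = []  # (element count, container bytes) recorded each time the size changes

for i in range(10_000):
    s.add(i)
    size = getsizeof(s)
    if size != last:
        jumps.append((len(s), size))
        last = size

# The size stays flat between jumps: memory is allocated in blocks,
# not one element at a time.
for count, size in jumps:
    print(f"{count:>6} elements -> {size:>8} bytes")
```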
Some examples:
In [26]: a=[i for i in range(60000)]
In [27]: b={i for i in range(60000)}
In [30]: b1={i for i in range(100000)}
In [31]: a1=[i for i in range(100000)]
In [32]: getsizeof(a)
Out[32]: 514568
In [33]: getsizeof(b)
Out[33]: 2097376
In [34]: getsizeof(a1)
Out[34]: 824464
In [35]: getsizeof(b1)
Out[35]: 4194528
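The set sizes reported in this thread can be broken down with plain arithmetic. The sketch below assumes 16 bytes per slot (an 8-byte hash plus an 8-byte pointer on a 64-bit build) and treats whatever is left over as a fixed header; both are assumptions, and split_size is a hypothetical helper, not part of any library.

```python
# Set sizes reported in the sessions above, keyed by element count.
reported = {1_000: 32_992, 60_000: 2_097_376, 100_000: 4_194_528}

SLOT_BYTES = 16  # assumption: 8-byte hash + 8-byte pointer per slot


def split_size(total_bytes):
    """Return (slots, leftover) for the largest power-of-two slot count that fits."""
    slots = 1
    while slots * 2 * SLOT_BYTES <= total_bytes:
        slots *= 2
    return slots, total_bytes - slots * SLOT_BYTES


breakdown = {n: split_size(size) for n, size in reported.items()}
for n, (slots, leftover) in breakdown.items():
    print(f"{n:>7} elements: {slots:>7} slots + {leftover} leftover bytes")
```

Under these assumptions every reported size splits into a power-of-two number of slots plus the same leftover, which is consistent with the table being far larger than the number of elements stored in it.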
Answers:
Yes, it is the internal structure a set uses to store its elements that consumes this much memory. And sys.getsizeof is correct; there is nothing wrong with using it here.
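One way to check this, sketched under the assumption that both containers hold the very same int objects: add the size of the elements to each container's own size and compare. The element cost cancels out, so the whole difference is the container overhead that sys.getsizeof already reports.

```python
from sys import getsizeof

n = 1000
items = list(range(n))
as_set = set(items)

# The int objects are shared by both containers, so their cost is identical.
elements = sum(getsizeof(x) for x in items)

total_list = getsizeof(items) + elements
total_set = getsizeof(as_set) + elements

# Both differences are the same number: the gap comes from the containers alone.
print(total_set - total_list, getsizeof(as_set) - getsizeof(items))
```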
For a more detailed reference about list, set or dict, please refer to this chapter: High Performance Python