list() 使用比列表推导略多的内存 [英] list() uses slightly more memory than list comprehension

查看:16
本文介绍了list() 使用比列表推导略多的内存的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

所以我在玩 list 对象时发现了一些奇怪的事情,如果 list 是用 list() 创建的,它会使用更多的内存,比列表理解?我使用的是 Python 3.5.2

在[1]中:导入系统在 [2] 中:a = list(range(100))在 [3] 中:sys.getsizeof(a)出[3]:1008在 [4] 中:b = [i for i in range(100)]在 [5] 中:sys.getsizeof(b)出[5]:912在 [6] 中:type(a) == type(b)输出[6]:真在 [7] 中:a == b输出[7]:真在 [8]: sys.getsizeof(list(b))出[8]:1008

来自

为什么会这样?

更新 #1

使用 Python 3.6.0b2 进行测试:

Python 3.6.0b2(默认,2016 年 10 月 11 日,11:52:53)[GCC 5.4.0 20160609] 在 linux 上输入帮助"、版权"、信用"或许可证"以获取更多信息.>>>导入系统>>>sys.getsizeof(list(range(100)))1008>>>sys.getsizeof([i for i in range(100)])912

更新 #2

使用 Python 2.7.12 进行测试:

Python 2.7.12(默认,2016 年 7 月 1 日,15:12:24)[GCC 5.4.0 20160609] 在 linux2输入帮助"、版权"、信用"或许可证"以获取更多信息.>>>导入系统>>>sys.getsizeof(list(xrange(100)))1016>>>sys.getsizeof([i for i in xrange(100)])920

解决方案

我认为您看到的是过度分配模式,这是一个 来自源的样本:

/* 这与列表大小成比例地过度分配,腾出空间* 用于额外增长.过度分配是轻微的,但* 足以在很长一段时间内提供线性时间摊销行为* 在表现不佳的情况下的 appends() 序列* 系统重新分配().* 增长模式为:0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...*/new_allocated = (newsize >> 3) + (newsize <9 ? 3 : 6);

<小时>

打印长度为 0-88 的列表推导式的大小,您可以看到模式匹配:

# 为 0-88 的尺寸创建理解理解 = [sys.getsizeof([1 for _ in range(l)]) for l in range(90)]# 只取那些导致与之前长度相比增长的步骤= zip(理解,理解[1:])增长 = [x for x in list(enumerate(steps)) if x[1][0] != x[1][1]]# 打印结果:对于增长的增长:打印(增长)

结果(格式为(列表长度,(旧总大小,新总大小))):

(0, (64, 96))(4, (96, 128))(8, (128, 192))(16, (192, 264))(25, (264, 344))(35, (344, 432))(46, (432, 528))(58, (528, 640))(72, (640, 768))(88, (768, 912))

<小时>

过度分配是出于性能原因,允许列表增长而不会随着每次增长分配更多内存(更好的摊销业绩).

与使用列表推导不同的一个可能原因是,列表推导无法确定性地计算生成列表的大小,但 list() 可以.这意味着理解将在使用过度分配填充列表时不断增长列表,直到最终填充它.

一旦完成,有可能不会增加具有未使用分配节点的过度分配缓冲区(实际上,在大多数情况下不会,这会破坏过度分配的目的).

list() 但是,无论列表大小如何,它都可以添加一些缓冲区,因为它提前知道最终列表的大小.

<小时>

同样来自源头的另一个支持证据是我们看到 list comprehensions 调用 LIST_APPEND,表示使用了 list.resize,这反过来表示在不知道有多少的情况下消耗了预分配缓冲区填充.这与您看到的行为一致.

<小时>

总而言之,list() 将根据列表大小预先分配更多节点

<预><代码>>>>sys.getsizeof(list([1,2,3]))60>>>sys.getsizeof(list([1,2,3,4]))64

列表推导不知道列表的大小,因此它在增长时使用附加操作,耗尽预分配缓冲区:

# 完全填充预分配缓冲区之前的一项>>>sys.getsizeof([i for i in [1,2,3]])52# 完全填充预分配缓冲区# 注意大小没有改变,我们仍然缓冲了未使用的节点>>>sys.getsizeof([i for i in [1,2,3,4]])52# 增加预分配缓冲区>>>sys.getsizeof([i for i in [1,2,3,4,5]])68

So i was playing with list objects and found little strange thing that if list is created with list() it uses more memory, than list comprehension? I'm using Python 3.5.2

In [1]: import sys
In [2]: a = list(range(100))
In [3]: sys.getsizeof(a)
Out[3]: 1008
In [4]: b = [i for i in range(100)]
In [5]: sys.getsizeof(b)
Out[5]: 912
In [6]: type(a) == type(b)
Out[6]: True
In [7]: a == b
Out[7]: True
In [8]: sys.getsizeof(list(b))
Out[8]: 1008

From the docs:

Lists may be constructed in several ways:

  • Using a pair of square brackets to denote the empty list: []
  • Using square brackets, separating items with commas: [a], [a, b, c]
  • Using a list comprehension: [x for x in iterable]
  • Using the type constructor: list() or list(iterable)

But it seems that using list() it uses more memory.

And as much list is bigger, the gap increases.

Why this happens?

UPDATE #1

Test with Python 3.6.0b2:

Python 3.6.0b2 (default, Oct 11 2016, 11:52:53) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.getsizeof(list(range(100)))
1008
>>> sys.getsizeof([i for i in range(100)])
912

UPDATE #2

Test with Python 2.7.12:

Python 2.7.12 (default, Jul  1 2016, 15:12:24) 
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.getsizeof(list(xrange(100)))
1016
>>> sys.getsizeof([i for i in xrange(100)])
920

解决方案

I think you're seeing over-allocation patterns this is a sample from the source:

/* This over-allocates proportional to the list size, making room
 * for additional growth.  The over-allocation is mild, but is
 * enough to give linear-time amortized behavior over a long
 * sequence of appends() in the presence of a poorly-performing
 * system realloc().
 * The growth pattern is:  0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
 */

new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6);


Printing the sizes of list comprehensions of lengths 0-88 you can see the pattern matches:

# create comprehensions for sizes 0-88
comprehensions = [sys.getsizeof([1 for _ in range(l)]) for l in range(90)]

# only take those that resulted in growth compared to previous length
steps = zip(comprehensions, comprehensions[1:])
growths = [x for x in list(enumerate(steps)) if x[1][0] != x[1][1]]

# print the results:
for growth in growths:
    print(growth)

Results (format is (list length, (old total size, new total size))):

(0, (64, 96)) 
(4, (96, 128))
(8, (128, 192))
(16, (192, 264))
(25, (264, 344))
(35, (344, 432))
(46, (432, 528))
(58, (528, 640))
(72, (640, 768))
(88, (768, 912))


The over-allocation is done for performance reasons allowing lists to grow without allocating more memory with every growth (better amortized performance).

A probable reason for the difference with using list comprehension, is that list comprehension can not deterministically calculate the size of the generated list, but list() can. This means comprehensions will continuously grow the list as it fills it using over-allocation until finally filling it.

It is possible that is will not grow the over-allocation buffer with unused allocated nodes once its done (in fact, in most cases it wont, that would defeat the over-allocation purpose).

list(), however, can add some buffer no matter the list size since it knows the final list size in advance.


Another backing evidence, also from the source, is that we see list comprehensions invoking LIST_APPEND, which indicates usage of list.resize, which in turn indicates consuming the pre-allocation buffer without knowing how much of it will be filled. This is consistent with the behavior you're seeing.


To conclude, list() will pre-allocate more nodes as a function of the list size

>>> sys.getsizeof(list([1,2,3]))
60
>>> sys.getsizeof(list([1,2,3,4]))
64

List comprehension does not know the list size so it uses append operations as it grows, depleting the pre-allocation buffer:

# one item before filling pre-allocation buffer completely
>>> sys.getsizeof([i for i in [1,2,3]]) 
52
# fills pre-allocation buffer completely
# note that size did not change, we still have buffered unused nodes
>>> sys.getsizeof([i for i in [1,2,3,4]]) 
52
# grows pre-allocation buffer
>>> sys.getsizeof([i for i in [1,2,3,4,5]])
68

这篇关于list() 使用比列表推导略多的内存的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆