Does reusing a list slice to get length cost additional memory?

Question

I proposed something in a comment on this answer. Martijn Pieters said that my suggestion would be memory intensive, and he's usually right, but I like to see things for myself, so I tried to profile it. Here's what I got:

#!/usr/bin/env python
""" interpolate.py """

from memory_profiler import profile

@profile
def interpolate1(alist):
    length = (1 + len(alist)) // 2
    alist[::2] = [0] * length

@profile
def interpolate2(alist):
    length = len(alist[::2])
    alist[::2] = [0] * length

a = []
b = []
for i in range(5, 9):
    print i
    exp = 10**i
    a[:] = range(exp)
    b[:] = range(exp)
    interpolate1(a)
    interpolate2(b)

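(For anyone trying to reproduce this: since @profile is imported explicitly from memory_profiler here, running the script directly with python interpolate.py should print a per-line report for each decorated function call.)
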
I don't see any incremental difference in memory cost for the slice solution, but I sometimes see one for the arithmetic solution. Take the results at exp = 7, for example:

7
Filename: interpolate.py

Line #    Mem usage    Increment   Line Contents
================================================
     5    750.1 MiB      0.0 MiB   @profile
     6                             def interpolate1(alist):
     7    750.1 MiB      0.0 MiB       length = (1 + len(alist)) // 2
     8    826.4 MiB     76.3 MiB       alist[::2] = [0] * length


Filename: interpolate.py

Line #    Mem usage    Increment   Line Contents
================================================
    10    826.4 MiB      0.0 MiB   @profile
    11                             def interpolate2(alist):
    12    826.4 MiB      0.0 MiB       length = len(alist[::2])
    13    826.4 MiB      0.0 MiB       alist[::2] = [0] * length

I tried a few other approaches to profiling, including running interpolate2 before interpolate1, randomizing the run order, and using much smaller lists, but the results are pretty consistent.

I can postulate that the results are because the memory is being allocated for the list slice either way, whether it's on the right or left side of the assignment, but any way you slice it, it looks like the slice solution breaks even with the arithmetic solution. Am I interpreting these results correctly?

Answer

Yes, additional memory will be reserved for a new list object that is created just for the slice.

However, the list object is discarded again after querying the length. You created a list object just to calculate how long half a list would be.

Memory allocations are relatively expensive, even if you then discard the object again. It is that cost I was referring to, while you are looking for a permanent memory footprint increase. However transient the list object might be, you still needed to allocate memory for this object.
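
To get a feel for how big that throwaway object is, here is a minimal sketch (my own illustration, not part of the original answer), assuming 64-bit CPython 2 as in the question. sys.getsizeof reports only the list object itself, i.e. its array of references, which is exactly the part allocated for the slice:

import sys

alist = range(10**5)

tmp = alist[::2]            # the temporary list built only to be measured
print len(tmp)              # 50000
print sys.getsizeof(tmp)    # size of the list object itself (its array of
                            # references): roughly 8 bytes per item on a
                            # 64-bit build, so on the order of 400 KB here
del tmp                     # discarded straight away, but the allocation and
                            # the copying of 50000 references were still paid for

That is presumably also why the line-by-line profile above shows no increment for interpolate2: memory_profiler samples process memory at line boundaries, and this temporary list is created and freed within a single line.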

The cost is immediately apparent when you use timeit to compare the two approaches:

>>> import timeit
>>> def calculate(alist):
...     (1 + len(alist)) // 2
... 
>>> def allocate(alist):
...     len(alist[::2])
... 
>>> testlist = range(10**5)
>>> timeit.timeit('f(testlist)', 'from __main__ import testlist, calculate as f', number=10000)
0.003368854522705078
>>> timeit.timeit('f(testlist)', 'from __main__ import testlist, allocate as f', number=10000)
2.7687110900878906

The slice only has to create a list object and copy across half the references, but that operation takes more than 800 times as long as simply calculating the length from the existing list.

Note that I actually had to reduce the timeit repetition count; the default 1 million repetitions was going to take an additional 4.5 minutes. I wasn't going to wait that long, while the straight calculation took a mere 0.18 seconds.
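
For scale (my arithmetic, not part of the original answer): 2.77 s for 10,000 calls extrapolates to roughly 277 s, i.e. about 4.6 minutes, for the default 1,000,000 calls, which is consistent with the estimate above.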
