Slicing a list in Python without generating a copy

Problem description

I have the following problem.

Given a list of integers L, I need to generate all of the sublists L[k:] for k in [0, len(L) - 1], without generating copies.

How do I accomplish this in Python? With a buffer object somehow?

Solution

The short answer

Slicing lists does not generate copies of the objects in the list; it just copies the references to them. That is the answer to the question as asked.

The long answer

Testing on mutable and immutable values

First, let's test the basic claim. We can show that even in the case of immutable objects like integers, only the reference is copied. Here are three different integer objects, each with the same value:

>>> a = [1000 + 1, 1000 + 1, 1000 + 1]

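They have the same value, but in this session they are three distinct objects, as their different ids show (ids vary from run to run, and some interpreter versions may fold the three constants into one shared object):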

>>> list(map(id, a))
[140502922988976, 140502922988952, 140502922988928]

When you slice them, the references remain the same. No new objects have been created:

>>> b = a[1:3]
>>> list(map(id, b))
[140502922988952, 140502922988928]

Using different objects with the same value shows that the copy process doesn't bother with interning -- it just directly copies the references.

Testing with mutable values gives the same result:

>>> a = [{0: 'zero', 1: 'one'}, ['foo', 'bar']]
>>> list(map(id, a))
[4380777000, 4380712040]
>>> list(map(id, a[1:]))
[4380712040]
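
The practical consequence of sharing references can be sketched as follows. This is an illustrative example, not part of the original session: mutating a shared object through the slice is visible from the original list, while rebinding a slot in the slice is not.

```python
# Slicing builds a new list, but its slots point at the same objects.
a = [{0: 'zero', 1: 'one'}, ['foo', 'bar']]
b = a[1:]

same_object = b[0] is a[1]   # the slice and the original share the inner list
b[0].append('baz')           # mutate the shared inner list through the slice
b[0] = 'replaced'            # rebind a slot in the slice only

print(same_object)
print(a[1])                  # the append is visible, the rebinding is not
print(b)
```

The outer list is new, so assignments to its slots stay local; only the objects the slots refer to are shared.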

Examining remaining memory overhead

Of course, the references themselves are copied. Each one costs 8 bytes on a 64-bit machine, and each list has its own fixed overhead, 72 bytes in the session below (the exact figure depends on the Python version):

>>> import sys
>>> for i in range(len(a) + 1):
...     x = a[:i]
...     print('len: {}'.format(len(x)))
...     print('size: {}'.format(sys.getsizeof(x)))
... 
len: 0
size: 72
len: 1
size: 80
len: 2
size: 88

As Joe Pinsonault reminds us, that overhead adds up. Integer objects themselves are not very large, only about three times the size of a reference, so sharing them saves some memory in absolute terms. Asymptotically, though, the slices L[k:] taken together still hold on the order of len(L)^2 references, so it would be nice to have multiple lists that are "views" into the same memory.
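
To get a feel for how the overhead adds up, here is a rough sketch that sums the size of every slice L[k:]. The helper name `suffix_slices_overhead` is made up for this illustration, and `sys.getsizeof` reports only the list objects themselves, not the shared integers. The exact numbers depend on the interpreter, so none are quoted here.

```python
import sys

def suffix_slices_overhead(n):
    """Bytes used by the list objects of all slices L[k:], excluding the shared integers."""
    L = list(range(n))
    return sum(sys.getsizeof(L[k:]) for k in range(len(L)))

small = suffix_slices_overhead(100)
large = suffix_slices_overhead(200)

# Doubling the list length more than doubles the total, because the
# slices together hold roughly n * (n + 1) / 2 references.
print(small, large)
```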

Saving memory by using views

Unfortunately, Python provides no easy way to produce objects that are "views" into lists. Or perhaps I should say "fortunately"! It means you don't have to worry about where a slice comes from; changes to the original won't affect the slice. Overall, that makes reasoning about a program's behavior much easier.
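
If a read-only view is enough, one can be written by hand. The class below is a hypothetical sketch (the name `SuffixView` is an assumption, not a standard library type): it stores only a reference to the base list and an offset, so creating all the views L[k:] copies no references.

```python
from collections.abc import Sequence

class SuffixView(Sequence):
    """Hypothetical read-only view of base[start:] that copies no references."""

    def __init__(self, base, start):
        self._base = base
        self._start = start

    def __len__(self):
        return max(0, len(self._base) - self._start)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Keep the sketch small: slicing a view falls back to a real list.
            return [self[i] for i in range(*index.indices(len(self)))]
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError('SuffixView index out of range')
        return self._base[self._start + index]

L = [10, 20, 30, 40]
views = [SuffixView(L, k) for k in range(len(L))]
print([list(v) for v in views])
```

Unlike a real slice, such a view follows later changes to `L`, so it gives up the independence described above in exchange for constant memory per view.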

If you really want to save memory by working with views, consider using numpy arrays. When you slice a numpy array, the memory is shared between the slice and the original:

>>> a = numpy.arange(3)
>>> a
array([0, 1, 2])
>>> b = a[1:3]
>>> b
array([1, 2])

What happens when we modify a and look again at b?

>>> a[2] = 1001
>>> b
array([   1, 1001])

But this means you have to be sure that when you modify one object, you aren't inadvertently modifying another. That's the trade-off when you use numpy: less work for the computer, and more work for the programmer!
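
The same trade-off can be seen without numpy. As a sketch of an alternative the answer itself does not use, the standard library's `array` and `memoryview` give a shared-memory view over integers stored inline, which is close to the "buffer object" idea in the question. It assumes the values fit in signed 64-bit integers.

```python
from array import array

a = array('q', [0, 1, 2])    # 'q': signed 64-bit integers stored inline
view = memoryview(a)[1:3]    # shares a's buffer; no elements are copied
independent = a[1:3]         # slicing the array itself makes a copy

a[2] = 1001

print(view.tolist())         # follows the change to a
print(independent.tolist())  # keeps the old values
```

Taking an explicit copy is how you opt out of the sharing when you need the two objects to evolve independently.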
