Python中的堆栈/列表-如何追加? [英] Stacks / list in python - how does it append?

查看:81
本文介绍了Python中的堆栈/列表-如何追加?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

如果我有一个列表:

list_1 = ["apples", "apricots", "oranges"]

然后我将一个新项目添加到列表中:浆果"

and I append an new item to the list : "berries"

list_1 = ["apples", "apricots", "oranges", "berries"]

(实际上),我想我记得我读过Python创建了另一个列表(列表_2)并将其指向原始列表(列表_1),以便列表_1保持静态...如果是这样,看起来会是这样吗(内幕)?

Under-the-hood (so to speak), I thought I remember reading that Python creates another list (list_2) and points it to the original list (list_1) so that list_1 remains static...if this is true, would it look something like this (under-the-hood)?

list_1 = ["apples", "apricots", ["oranges", "berries"]]

因此,以这种方式,原始列表将保持其大小.这是正确的观察方式吗?

So in this way, the original list maintains its size. Is this the correct way of looking at it?

推荐答案

否,当您调用 append 时,Python不会创建另一个列表.它就地改变现有列表.您可以很容易地看到这一点:

No, Python does not create another list when you call append. It mutates the existing list in-place. You can see this pretty easily:

>>> lst1 = []
>>> lst2 = lst1
>>> lst1.append(0)
>>> lst1
[0]
>>> lst2
[0]

如果您要创建另一个列表,则可以执行以下操作:

If you want to create another list, you can do this instead:

>>> lst1 = []
>>> lst2 = lst1
>>> lst1 = lst1 + [0]
>>> lst1
[0]
>>> lst2
[]


那么,就地附加功能如何工作?难道不只是在幕后列出数组吗?是的,他们是.Python的末尾留有一些空间,但是如果您 append 足够多的时间,它必须为该列表分配一个新数组,移至所有元素上,然后删除旧元素.它仍然是相同的列表对象,但内部具有不同的数组.


So, how does that in-place appending work? Aren't lists just arrays under the hood? Yes, they are. Python leaves a little space at the end, but if you append enough times, it has to allocate a new array for the list, move over all the elements, and delete the old one. It's still the same list object, but with a different array under the hood.

这种增长并不只是每次都添加一个新的广告位-这意味着每个 append 必须重新分配整个列表,因此附加将花费平均线性时间.相反,它乘以长度.像这样:

That growing doesn't just add one new slot each time—that would mean each append has to reallocate the whole list, so appending would take average linear time. Instead, it multiplies the length. Something like this:

new_capacity = max(4, capacity * 8 // 5, new_length)

( new_length 在其中,以防您一次用一大堆元素扩展扩展列表.)

(The new_length is there in case you're extending the list with a whole bunch of elements at once.)

通过几何扩展而不是算术扩展,我们可以保证,尽管一些 append 确实花费了线性时间,但其中的足够多的时间使得摊销时间是恒定的.确切地讲,您使用的因素是速度(较高的数字表示较少的重新分配)和空间(较高的数字表示末端有更多的浪费空间)之间的折衷.我不知道CPython的作用,但是您可以在下面链接的源代码中找到它.大多数系统使用的值在1.5到2.0之间(通常是小数的一部分,因此它们可以进行整数乘除).

By expanding geometrically rather than arithmetically, we can guarantee that, while a few appends do take linear time, enough of them are instant that the amortized time is constant. Exactly what factor you use is a tradeoff between speed (high numbers mean fewer reallocations) and space (higher numbers mean more wasted space on the end). I don't know what CPython does, but you can find it in the source code linked below. Most systems use a value between 1.5 and 2.0 (and usually a nice fraction of small numbers so they can do integer multiple and divide).

如果您真的想了解这一点,并且可以使用基本的C语言,则可以在

If you really want to understand this, and you can follow basic C, you can look under the hood at listobject.h and listobject.c. You'll probably want to read the C API docs first, but here's the basics (in Python-like pseudocode, and intentionally using not quite the real function and field names):

if lst.size + 1 > lst.allocated:
    new_capacity = <see above>
    lst.array = PyRealloc(<enough memory for new_capacity pointers>)
    lst.allocated = new_capacity
incref(new_item)
lst.array[lst.size] = new_item
lst.size += 1

Realloc 函数将成为该平台函数的一个瘦包装,它将尝试在适当的位置找到更多空间,但会回落到分配一个全新的指针并遍历所有内容.

The Realloc function is going to be a thin wrapper around the platform's function, which will try to find more room in-place, but fall back to allocating a totally new pointer and moving over all of the contents.

由于您使用的是Python,因此您很有可能是喜欢通过交互式实验学习的那种人.如果您不了解 ctypes.pythonapi .您绝对应该开始玩它.您可以从Python内部从 C API 调用几乎所有内容.不幸的是,您不能调用 #define 宏,或者在不做任何额外工作的情况下深入研究结构-但请参见 superhackyinternals ,了解如何进行一些额外的工作.(我认为我没有在列表中包含任何内容,而是查看int的工作原理,并且您应该能够从那里获取它-只是不要查看字符串,因为它们要复杂得多.)当然,要从您的解释器内部处理这些东西,您会经常遇到段错误,所以不要在有重要历史的会议上这样做.

Since you're using Python, there's a good chance you're the kind of person who likes to learn through interactive experimentation. If you don't know about ctypes.pythonapi. you should definitely start playing with it. You can call almost anything from the C API from inside Python. Unfortunately, you can't call #define macros, or dig into the structs without a bit of extra work—but see superhackyinternals for how you can do that bit of extra work. (I don't think I included anything there for lists, but look at how ints work, and you should be able to get it from there—just don't look at strings, because they're a lot more complicated.) Of course playing around with this stuff from inside your interpreter, you're going to segfault a lot, so don't do it in a session where you've got any important history.

当然,并不能保证每个Python实现都正确.只要实现可以提供文档化的接口和性能特征,它就可以构建所需的列表.例如,也许IronPython在.NET类库中使用了一些矢量类.当然,该类将在自己的框架下进行类似的重新分配和移动,但是IronPython不会在乎它是如何做到的(而且您甚至会在乎).

And of course that isn't guaranteed to be true for every Python implementation. As long as an implementation can provide the documented interface and performance characteristics, it can build lists however it wants. For example, maybe IronPython uses some vector class in the .NET class library. Of course that class will do similar reallocate-and-move under its own hood, but IronPython won't care how it does that (and you'll care even less).

这篇关于Python中的堆栈/列表-如何追加?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆