Swift 内部是如何管理数组的? [英] How does Swift manage Arrays internally?

查看:39
本文介绍了Swift 内部是如何管理数组的?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想知道 Swift 是如何在内部管理数组的?Apple 的语言指南 只处理用法,没有详细说明内部结构.

I would like to know how Swift managed arrays internally? Apple's language guide only handles usage, but does not elaborate on internal structures.

作为一名 Java 开发人员,我习惯于将裸"数组视为非常静态和固定的数据结构.我知道这在 Swift 中并非如此.在 Swift 中,除了在 Java 中,您还可以改变数组的长度,也可以执行插入和删除操作.在 Java 中,我习惯于根据我想要使用该结构执行的操作来决定我想要使用的数据结构(简单数组、ArrayList、LinkedList 等),从而优化我的代码以获得更好的性能.

As a Java-developer I am used to looking at "bare" arrays as a very static and fixed data structure. I know that this is not true in Swift. In Swift, other than in Java, you can mutate the length of an array and also perform insert and delete operations. In Java I am used to decide what data structure I want to use (simple arrays, ArrayList, LinkedList etc.) based on what operations I want to perform with that structure and thus optimise my code for better performance.

总而言之,我想知道数组是如何在 Swift 中实现的.它们在内部是否作为(双)链表进行管理?有什么可以与 Java 的 Collection Framework 相媲美的,以便进行调优以获得更好的性能?

In conclusion, I would like to know how arrays are implemented in Swift. Are they internally managed as (double) linked lists? And is there anything available comparable to Java's Collection Framework in order to tune for better performance?

推荐答案

您可以在 Swift 标准库中 Array 上方的注释中找到很多关于 Array 的信息.要看到这一点,您可以在操场中 cmd-opt-click Array,或者您可以在非官方的 SwiftDoc 页面.

You can find a lot of information on Array in the comment above it in the Swift standard library. To see this, you can cmd-opt-click Array in a playground, or you could look at it in the unofficial SwiftDoc page.

转述其中的一些信息来回答您的问题:

To paraphrase some of the info from there to answer your questions:

在 Swift 中创建的数组将它们的值保存在一个连续的内存区域中.因此,您可以有效地将 Swift 数组传递到需要这种结构的 C API 中.

Arrays created in Swift hold their values in a contiguous region of memory. For this reason, you can efficiently pass a Swift array into a C API that requires that kind of structure.

正如您所提到的,数组可以随着您向其附加值而增长,并且在某些时候这意味着分配了一个新的、更大的内存区域,并将以前的值复制到其中.正是因为这个原因,它声明像 append 可能这样的操作是 O(n)——也就是说,执行 append 操作的最坏情况时间成比例地增长数组的当前大小(因为复制值所花费的时间).

As you mention, an array can grow as you append values to it, and at certain points that means that a fresh, larger, region of memory is allocated, and the previous values are copied into it. It is for this reason that its stated that operations like append may be O(n) – that is, the worst-case time to perform an append operation grows in proportion to the current size of the array (because of the time taken to copy the values over).

然而,当数组必须增加其存储量时,它每次分配的新存储量呈指数增长,这意味着重新分配会随着您追加而变得越来越少,这意味着追加所有调用的摊销"时间接近恒定时间.

However, when the array has to grow its storage, the amount of new storage it allocates each time grows exponentially, which means that reallocations become rarer and rarer as you append, which means the "amortized" time to append over all calls approaches constant time.

数组还有一个方法,reserveCapacity,它允许你通过请求数组预先分配一些最小的空间来抢先避免在调用 append 时重新分配.如果您提前知道计划在数组中保存多少个值,则可以使用它.

Arrays also have a method, reserveCapacity, that allows you to preemptively avoid reallocations on calling append by requesting the array allocate itself some minimum amount of space up front. You can use this if you know ahead of time how many values you plan to hold in the array.

在数组中间插入一个新值也是O(n),因为数组保存在连续的内存中,所以插入一个新值涉及将后续值混洗到最后.与附加不同,这不会在多次调用中得到改善.这与您可以在 O(1) 中插入的链表(即常数时间)非常不同.但请记住,与链表不同,数组也可以在恒定时间内随机访问,这是一个很大的权衡.

Inserting a new value into the middle of an array is also O(n), because arrays are held in contiguous memory, so inserting a new value involves shuffling subsequent values along to the end. Unlike appending though, this does not improve over multiple calls. This is very different from, say, a linked list where you can insert in O(1) i.e. constant time. But bear in mind the big tradeoff is that arrays are also randomly accessible in constant time, unlike linked lists.

就地更改数组中的单个值(即通过下标分配)应该是 O(1)(subscript 实际上没有记录注释但这是一个非常安全的赌注).这意味着如果您创建一个数组,填充它,然后不追加或插入其中,它的行为应该类似于 Java 数组的性能.

Changes to single values in the array in-place (i.e. assigning via a subscript) should be O(1) (subscript doesn't actually have a documenting comment but this is a pretty safe bet). This means if you create an array, populate it, and then don't append or insert into it, it should behave similarly to a Java array in terms of performance.

有一个警告——数组具有值"语义.这意味着如果您有一个数组变量 a,并将其分配给另一个数组变量 b,这实质上就是复制该数组.a 中的值的后续更改不会影响 b,并且更改 b 不会影响 a.这与引用"语义不同,其中 ab 都指向同一个数组,并且通过 a 对其进行的任何更改都将反映到有人通过b查看它.

There's one caveat to all this – arrays have "value" semantics. This means if you have an array variable a, and you assign it to another array variable b, this is essentially copying the array. Subsequent changes to the values in a will not affect b, and changing b will not affect a. This is unlike "reference" semantics where both a and b point to the same array and any changes made to it via a would be reflected to someone looking at it via b.

然而,Swift 数组实际上是写时复制".也就是说,当您将 a 分配给 b 时,实际上不会发生复制.它仅在两个变量之一发生变化(变异")时发生.这带来了很大的性能优势,但它确实意味着如果两个阵列引用同一个存储,因为自复制以来都没有执行过写入,像下标分配这样的更改确实会产生一次性复制整个阵列的成本点.

However, Swift arrays are actually "Copy-on-Write". That is, when you assign a to b no copying actually takes place. It only happens when one of the two variables is changed ("mutated"). This brings a big performance benefit, but it does mean that if two arrays are referencing the same storage because neither has performed a write since the copy, a change like a subscript assign does have a one-off cost of duplicating the entire array at that point.

在大多数情况下,除非在极少数情况下(尤其是在处理小到中等大小的数组时),否则您不需要担心任何这些,但是如果性能对您来说至关重要,那么您绝对值得熟悉一下包含该链接中的所有文档.

For the most part, you shouldn't need to worry about any of this except in rare circumstances (especially when dealing with small-to-modest-size arrays), but if performance is critical to you it's definitely worth familiarizing yourself with all of the documentation in that link.

这篇关于Swift 内部是如何管理数组的?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆