Linked list vs dynamic array for implementing a stack using vector class


Problem description

I was reading up on the two different ways of implementing a stack: a linked list and a dynamic array. The main advantage of a linked list over a dynamic array was that the linked list did not have to be resized, while a dynamic array had to be resized if too many elements were inserted, hence wasting a lot of time and memory.

That got me wondering if this is true for C++ (as there is a vector class which automatically resizes whenever new elements are inserted)?

Recommended answer

It's difficult to compare the two, because the patterns of their memory usage are quite different.

Vector sizing

A vector resizes itself dynamically as needed. It does that by allocating a new chunk of memory, moving (or copying) data from the old chunk to the new chunk, then releasing the old one. In a typical case, the new chunk is 1.5x the size of the old (contrary to popular belief, 2x seems to be quite unusual in practice). That means for a short time while reallocating, it needs memory equal to roughly 2.5x as much as the data you're actually storing. The rest of the time, the "chunk" that's in use is a minimum of 2/3rds full, and a maximum of completely full. If all sizes are equally likely, we can expect it to average about 5/6ths full. Looking at it from the other direction, we can expect about 1/6th, or about 17%, of the space to be "wasted" at any given time.

When we resize by a constant factor like that (rather than, for example, always adding a chunk of a specific size, such as growing in 4KB increments), we get what's called amortized constant-time addition. In other words, as the array grows, resizing happens exponentially less often. The average number of times items in the array have been copied tends to a constant (usually around 3, but it depends on the growth factor you use).
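
You can watch this happen with a real std::vector. The sketch below is my own illustration, not part of the original answer; it prints the capacity after every reallocation, and the exact growth factor is implementation-defined, so the ratios you see depend on your standard library.

    #include <cstddef>
    #include <iostream>
    #include <vector>

    int main() {
        std::vector<int> v;
        std::size_t last = v.capacity();

        for (int i = 0; i < 100000; ++i) {
            v.push_back(i);
            if (v.capacity() != last) {
                // A reallocation just happened: report the new capacity and the
                // ratio to the previous one (roughly 1.5x or 2x in practice).
                std::cout << "size " << v.size() << " -> capacity " << v.capacity();
                if (last != 0)
                    std::cout << " (grew by x" << double(v.capacity()) / last << ')';
                std::cout << '\n';
                last = v.capacity();
            }
        }
    }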

Linked list allocation

Using a linked list, the situation is rather different. We never see resizing, so we don't see extra time or memory usage for some insertions. At the same time, we do see extra time and memory used essentially all the time. In particular, each node in the linked list needs to contain a pointer to the next node. Depending on the size of the data in the node compared to the size of a pointer, this can lead to significant overhead. For example, let's assume you need a stack of ints. In the typical case where an int is the same size as a pointer, that's going to mean 50% overhead -- all the time. It's increasingly common for a pointer to be larger than an int; twice the size is fairly common (64-bit pointer, 32-bit int). In such a case, you have ~67% overhead -- i.e., obviously enough, each node devotes twice as much space to the pointer as to the data being stored.
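
The overhead is easy to see just from sizeof. The Node type below is a hypothetical hand-rolled stack node, written only for illustration:

    #include <iostream>

    // A hypothetical singly linked stack node holding one int.
    struct Node {
        int   value; // the payload we actually care about
        Node* next;  // bookkeeping needed only to link the nodes together
    };

    int main() {
        std::cout << "sizeof(int)   = " << sizeof(int)   << '\n';
        std::cout << "sizeof(Node*) = " << sizeof(Node*) << '\n';
        // On a typical 64-bit platform this prints 16, not 12: 4 bytes of int,
        // 8 bytes of pointer, and 4 bytes of alignment padding.
        std::cout << "sizeof(Node)  = " << sizeof(Node)  << '\n';
    }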

Unfortunately, that's often just the tip of the iceberg. In a typical linked list, each node is dynamically allocated individually. At least if you're storing small data items (such as int), the memory allocated for a node may be (and usually will be) even larger than the amount you actually request. So -- you ask for 12 bytes of memory to hold an int and a pointer -- but the chunk of memory you get is likely to be rounded up to 16 or 32 bytes instead. Now you're looking at overhead of at least 75%, and quite possibly ~88%.
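
Whether, and by how much, an allocation gets rounded up depends on the allocator, so it can only be shown with a non-portable sketch; malloc_usable_size below is a glibc extension (not standard C++), and the numbers will differ elsewhere.

    #include <cstdlib>
    #include <iostream>
    #include <malloc.h>   // malloc_usable_size -- glibc-specific, not standard C++

    struct Node {
        int   value;
        Node* next;
    };

    int main() {
        Node* n = static_cast<Node*>(std::malloc(sizeof(Node)));
        std::cout << "requested: " << sizeof(Node) << " bytes\n";
        // On 64-bit glibc a 16-byte request typically yields a 24-byte usable
        // block, carved from a 32-byte chunk including allocator bookkeeping.
        std::cout << "usable:    " << malloc_usable_size(n) << " bytes\n";
        std::free(n);
    }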

As far as speed goes, the situation is rather similar: allocating and freeing memory dynamically is often quite slow. The heap manager typically has blocks of free memory, and has to spend time searching through them to find the block that's most suited to the size you're asking for. Then it (typically) has to split that block into two pieces, one to satisfy your allocation, and another of the remaining memory it can use to satisfy other allocations. Likewise, when you free memory, it typically goes back to that same list of free blocks and checks whether there's an adjoining block of memory already free, so it can join the two back together.

Allocating and managing lots of blocks of memory is expensive.
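
A rough way to feel that cost (my own benchmark sketch, not code from the answer) is to push the same number of ints into a contiguous vector and into a std::list, which makes one heap allocation per node. Build with optimizations, and expect the ratio to vary by platform and allocator.

    #include <chrono>
    #include <iostream>
    #include <list>
    #include <vector>

    template <typename Container>
    void time_pushes(const char* label, int n) {
        auto start = std::chrono::steady_clock::now();
        Container c;
        for (int i = 0; i < n; ++i)
            c.push_back(i);   // std::list allocates a node per call; std::vector reallocates rarely
        auto stop = std::chrono::steady_clock::now();
        auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
        // Printing the size keeps the work from being optimized away entirely.
        std::cout << label << ": " << ms << " ms for " << c.size() << " elements\n";
    }

    int main() {
        const int n = 10'000'000;
        time_pushes<std::vector<int>>("vector", n);
        time_pushes<std::list<int>>  ("list  ", n);
    }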

Cache usage

Finally, with recent processors we run into another important factor: cache usage. In the case of a vector, we have all the data right next to each other. Then, after the end of the part of the vector that's in use, we have some empty memory. This leads to excellent cache usage -- the data we're using gets cached; the data we're not using has little or no effect on the cache at all.

With a linked list, the pointers (and probable overhead in each node) are distributed throughout our list. I.e., each piece of data we care about has, right next to it, the overhead of the pointer, and the empty space allocated to the node that we're not using. In short, the effective size of the cache is reduced by about the same factor as the overall overhead of each node in the list -- i.e., we might easily see only 1/8th of the cache storing the data we care about, and 7/8ths devoted to storing pointers and/or pure garbage.
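
The effect shows up in something as simple as summing the elements. This sketch (again my own, not from the answer) walks the same values stored contiguously and stored as list nodes:

    #include <chrono>
    #include <iostream>
    #include <list>
    #include <numeric>
    #include <vector>

    template <typename Container>
    void time_sum(const char* label, const Container& c) {
        auto start = std::chrono::steady_clock::now();
        // Same arithmetic either way; only the memory layout being walked differs.
        long long sum = std::accumulate(c.begin(), c.end(), 0LL);
        auto stop = std::chrono::steady_clock::now();
        auto us = std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
        std::cout << label << ": sum = " << sum << " in " << us << " us\n";
    }

    int main() {
        const int n = 5'000'000;
        std::vector<int> v(n, 1);  // contiguous: prefetch-friendly
        std::list<int>   l(n, 1);  // one scattered heap node per element
        time_sum("vector", v);
        time_sum("list  ", l);
    }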

Summary

A linked list can work well when you have a relatively small number of nodes, each of which is individually quite large. If (as is more typical for a stack) you're dealing with a relatively large number of items, each of which is individually quite small, you're much less likely to see a savings in time or memory usage. Quite the contrary, for such cases, a linked list is much more likely to basically waste a great deal of both time and memory.
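
In C++ the choice is exposed directly by std::stack, which is an adapter over an underlying container, so the same stack code can be backed by different storage strategies. A minimal sketch:

    #include <iostream>
    #include <list>
    #include <stack>
    #include <vector>

    int main() {
        std::stack<int>                   a; // default backing store: std::deque<int>
        std::stack<int, std::vector<int>> b; // contiguous storage, amortized growth
        std::stack<int, std::list<int>>   c; // one heap-allocated node per element

        for (int i = 0; i < 5; ++i) {
            a.push(i);
            b.push(i);
            c.push(i);
        }

        // The interface is identical; only the memory layout and the
        // allocation pattern underneath differ.
        std::cout << a.top() << ' ' << b.top() << ' ' << c.top() << '\n';
    }

The default, std::deque, is itself a middle ground: it grows in fixed-size blocks, so existing elements never have to be copied on growth, and it still avoids paying a pointer per element.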
