为什么将双端队列实现为链接列表而不是循环数组? [英] Why is deque implemented as a linked list instead of a circular array?

查看:122
本文介绍了为什么将双端队列实现为链接列表而不是循环数组?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

将CPython deque 实现作为双向链接列表64个项目大小的块(数组)。除了链接列表两端的块外,所有块均已满。 IIUC,当 pop / popleft 删除块中的最后一项时,释放块。在追加 / 追加左尝试添加新项目且相关块已满时分配它们。

CPython deque is implemented as a doubly-linked list of 64-item sized "blocks" (arrays). The blocks are all full, except for the ones at either end of the linked list. IIUC, the blocks are freed when a pop / popleft removes the last item in the block; they are allocated when append/appendleft attempts to add a new item and the relevant block is full.

我了解列出的使用块的链接列表而不是项目的链接列表的优点


  • 减少指向prev和每个项目中的下一个

  • 减少执行 malloc / 免费的运行时成本每个添加/删除的项目

  • 通过将连续的指针彼此相邻来改善缓存位置

  • reduce memory cost of pointers to prev and next in every item
  • reduce runtime cost of doing malloc/free for every item added/removed
  • improve cache locality by placing consecutive pointers next to each other

但是为什么不首先使用单个动态大小的环形数组而不是双向链接列表呢?

But why wasn't a single dynamically-sized circular array used instead of the doubly-linked list in the first place?

AFAICT,环形数组将保留所有上述内容优势,并保持 pop * / append * 的(摊销)成本为 O(1)(通过总计,就像在 list 中一样)。此外,它将按索引查找的成本从当前的 O(n)提高到 O(1) 。循环数组也将更易于实现,因为它可以重用 list 的大部分实现。

AFAICT, the circular array would preserve all the above advantages, and maintain the (amortized) cost of pop*/append* at O(1) (by overallocating, just like in list). In addition, it would improve the cost of lookup by index from the current O(n) to O(1). A circular array would also be simpler to implement, since it can reuse much of the list implementation.

我可以看到一个支持C ++之类的链表的参数,其中可以使用指针或迭代器在 O(1)中从中间删除项目;但是,python deque 没有API可以做到这一点。

I can see an argument in favor of a linked list in languages like C++, where removal of an item from the middle can be done in O(1) using a pointer or iterator; however, python deque has no API to do this.

推荐答案

从我对python-dev邮件列表的答复中:

Adapted from my reply on the python-dev mailing list:

双端队列的主要目的是提高弹出和推送的效率。这就是当前的实现方式:无论双端队列中有多少项目,每次推送或弹出的最坏情况下的恒定时间。无论大小,都超过了摊销O(1)。这就是为什么这样做的原因。

The primary point of a deque is to make popping and pushing at both ends efficient. That's what the current implementation does: worst-case constant time per push or pop regardless of how many items are in the deque. That beats "amortized O(1)" in the small and in the large. That's why it was done this way.

因此,其他一些双端队列方法比列表慢,但谁在乎呢?例如,我曾经与双端队列一起使用的唯一索引是0和-1(以窥视双端队列的一端或另一端),并且该实现还使可以恒定时间访问这些特定索引。

Some other deque methods are consequently slower than they are for lists, but who cares? For example, the only indices I've ever used with a deque are 0 and -1 (to peek at one end or the other of a deque), and the implementation makes accessing those specific indices constant-time too.

确实是Jim Fasarakis Hilliard在他的评论中引用的来自Raymond Hettinger的消息:

Indeed, the message from Raymond Hettinger referenced by Jim Fasarakis Hilliard in his comment:

https://www.mail-archive.com/python-dev@python.org/msg25024.html

确认


<$ c的唯一原因$ c> __ getitem __ 的目的是支持快速访问头和尾,而无需实际弹出值

The only reason that __getitem__ was put in was to support fast access to the head and tail without actually popping the value

这篇关于为什么将双端队列实现为链接列表而不是循环数组?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆