Implement an immutable deque as a balanced binary tree?

Question

I've been thinking for a while about how to go about implementing a deque (that is, a double-ended queue) as an immutable data structure.

There seem to be different ways of doing this. AFAIK, immutable data structures are generally hierarchical, so that major parts of it can be reused after modifying operations such as the insertion or removal of an item.

Eric Lippert has two articles on his blog about this topic, along with sample implementations in C#.

Both of his implementations strike me as more elaborate than is actually necessary. Couldn't deques simply be implemented as binary trees, where elements can only be inserted or removed on the very "left" (the front) and on the very "right" (the back) of the tree?

                               o
                              / \
                             …   …
                            /     \
                           …       …
                          / \     / \
              front -->  L   …   …   R  <-- back

Additionally, the tree would be kept reasonably balanced with rotations:

  • rotate right when inserting at the front or removing from the back, and
  • rotate left when removing from the front or inserting at the back.

Eric Lippert is, in my opinion, a very smart person whom I deeply respect, yet he apparently didn't consider this approach. Thus I wonder, was it for a good reason? Is my suggested way of implementing deques naïve?

Answer

As Daniel noted, implementing immutable deques with well-known balanced search trees like AVL or red-black trees gives Θ(lg n) worst case complexity. Some of the implementations Lippert discusses may seem elaborate at first glance, but there are many immutable deques with o(lg n) worst or average or amortized complexity that are built from balanced trees along with two simple ideas:

  1. Reverse the Spines

To perform deque operations on a traditional balanced search tree, we need access to the ends, but we only have access to the center. To get to the left end, we must navigate left child pointers until we finally reach a dead end. It would be preferable to have a pointer to the left and right ends without all that navigation effort. In fact, we really don't need access to the root node very often. Let's store a balanced search tree so that access to the ends is O(1).

Here is an example in C of how you might normally store an AVL tree:

struct AVLTree {
  const char * value;
  int height;
  struct AVLTree * leftChild;
  struct AVLTree * rightChild;
};

To set up the tree so that we can start at the edges and move towards the root, we change the tree and store all of the pointers along the paths from the root to the left and rightmost children in reverse. (These paths are called the left and right spine, respectively). Just like reversing a singly-linked list, the last element becomes the first, so the leftmost child is now easily accessible.

This is a little tricky to understand. To help explain it, imagine that you only did this for the left spine:

struct LeftSpine {
  const char * value;
  int height;
  struct AVLTree * rightChild;
  struct LeftSpine * parent;
};

In some sense, the leftmost child is now the "root" of the tree. If you drew the tree this way, it would look very strange, but if you simply take your normal drawing of a tree and reverse all of the arrows on the left spine, the meaning of the LeftSpine struct should become clearer. Access to the left side of the tree is now immediate. The same can be done for the right spine:

struct RightSpine {
  const char * value;
  int height;
  struct AVLTree * leftChild;
  struct RightSpine * parent;
};

If you store both a left and a right spine as well as the center element, you have immediate access to both ends. Inserting and deleting may still be Ω(lg n), because rebalancing operations may require traversing the entire left or right spine, but simply viewing the leftmost and rightmost elements is now O(1).
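Putting this first idea together, here is a minimal C sketch; the `Deque` struct and the `peek_*` helpers are my own illustrative names, not from the original answer, and the spine structs are re-declared so the snippet stands alone:

```c
/* Illustrative sketch: a deque view of an AVL tree whose left and right
   spines are stored reversed. The deque keeps direct pointers to the two
   ends, so peeking either end requires no traversal at all. */
struct AVLTree {
  const char * value; int height;
  struct AVLTree * leftChild; struct AVLTree * rightChild;
};

struct LeftSpine {
  const char * value; int height;
  struct AVLTree * rightChild; struct LeftSpine * parent;
};

struct RightSpine {
  const char * value; int height;
  struct AVLTree * leftChild; struct RightSpine * parent;
};

struct Deque {
  struct LeftSpine  * front;  /* leftmost node: end of the reversed left spine   */
  const char        * center; /* the value that sat at the old root              */
  struct RightSpine * back;   /* rightmost node: end of the reversed right spine */
};

/* O(1): just follow the stored end pointer; no navigation from the root. */
const char * peek_front(const struct Deque * d) {
  return d->front ? d->front->value : d->center;
}

const char * peek_back(const struct Deque * d) {
  return d->back ? d->back->value : d->center;
}
```

A push or pop would still have to rebuild and possibly rebalance along a spine, but the peeks above never touch more than one node.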

An example of this strategy is used to make purely functional treaps with implementations in SML and Java (more documentation). This is also a key idea in several other immutable deques with o(lg n) performance.

  2. Bound the Rebalancing Work

As noted above, inserting at the left or rightmost end of an AVL tree can require Ω(lg n) time for traversing the spine. Here is an example of an AVL tree that demonstrates this:

A full binary tree is defined by induction as:

  • A full binary tree of height 0 is an empty node.
  • A full binary tree of height n+1 has a root node with full binary trees of height n as its children.

Pushing an element onto the left of a full binary tree will necessarily increase the maximum height of the tree. Since the AVL trees above store that information in each node, and since every tree along the left spine of a full binary tree is also a full binary tree, pushing an element onto the left of an AVL deque that happens to be a full binary tree will require incrementing Ω(lg n) height values along the left spine.

(Two notes on this: (a) You can store AVL trees without keeping the height in the node; instead you keep only balance information (left-taller, right-taller, or even). This doesn't change the performance of the above example. (b) In AVL trees, you might need to do not only Ω(lg n) balance or height information updates, but Ω(lg n) rebalancing operations. I don't recall the details of this, and it may be only on deletions, rather than insertions.)

In order to achieve o(lg n) deque operations, we need to limit this work. Immutable deques represented by balanced trees usually use at least one of the following strategies:

  • Anticipate where rebalancing will be needed. If you are using a tree that requires o(lg n) rebalancing but you know where that rebalancing will be needed and you can get there quickly enough, you can perform your deque operations in o(lg n) time. Deques that use this as a strategy will store not just two pointers into the deque (the ends of the left and right spines, as discussed above), but some small number of jump pointers to places higher along the spines. Deque operations can then access the roots of the trees pointed to by the jump pointers in O(1) time. If o(lg n) jump pointers are maintained for all of the places where rebalancing (or changing node information) will be needed, deque operations can take o(lg n) time.

(Of course, this makes the tree actually a dag, since the trees on the spines pointed to by jump pointers are also pointed to by their children on the spine. Immutable data structures don't usually get along with non-tree graphs, since replacing a node pointed to by more than one other node requires replacing all of the other nodes that point to it. I have seen this fixed by just eliminating the non-jump pointers, turning the dag back into a tree. One can then store a singly-linked list with jump pointers as a list of lists. Each subordinate list contains all of the nodes between the head of that list and its jump pointer. This requires some care to deal with partially overlapping jump pointers, and a full explanation is probably not appropriate for this aside.)

This is one of the tricks used by Tsakalidis in his paper "AVL Trees for localized search" to allow O(1) deque operations on AVL trees with a relaxed balance condition. It is also the main idea used by Kaplan and Tarjan in their paper "Purely functional, real-time deques with catenation" and a later refinement of that by Mihaesau and Tarjan. Munro et al.'s "Deterministic Skip Lists" also deserves a mention here, though translating skip lists to an immutable setting by using trees sometimes changes the properties that allow such efficient modification near the ends. For examples of the translation, see Messeguer's "Skip trees, an alternative data structure to Skip lists in a concurrent approach", Dean and Jones's "Exploring the duality between skip lists and binary search trees", and Lamoureux and Nickerson's "On the Equivalence of B-trees and deterministic skip lists".

  • Do the work in bulk. In the full binary tree example above, no rebalancing is needed on a push, but Ω(lg n) nodes need to have their height or balance information updated. Instead of actually doing the incrementation, you could simply mark the spine at the ends as needing incrementation.

One way to understand this process is by analogy to binary numbers. (2^n)-1 is represented in binary by a string of n 1's. When adding 1 to this number, you need to change all of the 1's to 0's and then add a 1 at the end. The following Haskell encodes binary numbers as non-empty strings of bits, least significant first.

data Bit = Zero | One

type Binary = (Bit,[Bit])

incr :: Binary -> Binary
incr (Zero,x) = (One,x)
incr (One,[]) = (Zero,[One])
incr (One,(x:xs)) = 
    let (y,ys) = incr (x,xs)
    in (Zero,y:ys)

incr is a recursive function, and for numbers of the form (One,replicate k One), incr calls itself Ω(k) times.
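The Ω(k) carry cascade that incr performs can also be sketched in C over a little-endian bit array; `incr_bits` and its step count are illustrative, not part of the original answer:

```c
/* Increment a little-endian bit array in place and return how many bits
   the carry chain examined. `bits` must have room for one extra bit;
   *len grows when the number gains a digit. Illustrative sketch only. */
static int incr_bits(int *bits, int *len) {
  int steps = 0;
  for (int i = 0; i < *len; i++) {
    steps++;
    if (bits[i] == 0) {  /* carry stops at the first 0 */
      bits[i] = 1;
      return steps;
    }
    bits[i] = 0;         /* a 1 flips to 0 and the carry continues */
  }
  bits[*len] = 1;        /* the number grew by one bit */
  (*len)++;
  return steps;
}
```

Incrementing 7 (bits {1,1,1}) walks the whole array, three steps plus a new leading bit: exactly the (One, replicate k One) worst case, while most increments stop after a step or two.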

Instead, we might represent groups of equal bits by only the number of bits in the group. Neighboring bits or groups of bits are combined into one group if they are equal (in value, not in number). We can increment in O(1) time:

data Bits = Zeros Int | Ones Int

type SegmentedBinary = (Bits,[Bits])

segIncr :: SegmentedBinary -> SegmentedBinary
segIncr (Zeros 1,[]) = (Ones 1,[])
segIncr (Zeros 1,(Ones n:rest)) = (Ones (n+1),rest)
segIncr (Zeros n,rest) = (Ones 1,Zeros (n-1):rest)
segIncr (Ones n,[]) = (Zeros n,[Ones 1])
segIncr (Ones n,(Zeros 1:Ones m:rest)) = (Zeros n,Ones (m+1):rest)
segIncr (Ones n,(Zeros p:rest)) = (Zeros n,Ones 1:Zeros (p-1):rest)

Since segIncr is not recursive and doesn't call any functions other than plus and minus on Ints, you can see it takes O(1) time.

Some of the deques mentioned in the section above entitled "Anticipate where rebalancing will be needed" actually use a different numerically-inspired technique called "redundant number systems" to limit the rebalancing work to O(1) and locate it quickly. Redundant numerical representations are fascinating, but possibly too far afield for this discussion. Elmasry et al.'s "Strictly-regular number system and data structures" is not a bad place to start reading about that topic. Hinze's "Bootstrapping one-sided flexible arrays" may also be useful.

In "Making data structures persistent", Driscoll et al. describe lazy recoloring, which they attribute to Tsakalidis. They apply it to red-black trees, which can be rebalanced after insertion or deletion with O(1) rotations (but Ω(lg n) recolorings) (see Tarjan's "Updating a balanced tree in O(1) rotations"). The core of the idea is to mark a large path of nodes that need to be recolored but not rotated. A similar idea is used on AVL trees in older versions of Brown & Tarjan's "A fast merging algorithm". (Newer versions of the same work use 2-3 trees; I have not read the newer ones and I do not know if they use any techniques like lazy recoloring.)

  • Randomize. Treaps, mentioned above, can be implemented in a functional setting so that they perform deque operations in O(1) time on average. Since deques do not need to inspect their elements, this average is not susceptible to malicious input degrading performance, unlike simple (no rebalancing) binary search trees, which are fast on average input. Treaps use an independent source of random bits instead of relying on randomness from the data.

In a persistent setting, treaps may be susceptible to degraded performance from malicious input with an adversary who can both (a) use old versions of a data structure and (b) measure the performance of operations. Because they do not have any worst-case balance guarantees, treaps can become quite unbalanced, though this should happen rarely. If an adversary waits for a deque operation that takes a long time, she can initiate that same operation repeatedly in order to measure and take advantage of a possibly unbalanced tree.

If this is not a concern, treaps are an attractively simple data structure. They are very close to the AVL spine tree described above.

Skip lists, mentioned above, might also be amenable to functional implementations with O(1) average-time deque operations.

The first two techniques for bounding the rebalancing work require complex modifications to data structures while usually affording a simple analysis of the complexity of deque operations. Randomization, along with the next technique, have simpler data structures but more complex analysis. The original analysis by Seidel and Aragon is not trivial, and there is some complex analysis of exact probabilities using more advanced mathematics than is present in the papers cited above -- see Flajolet et al.'s "Patterns in random binary search trees".

  • Amortize. There are several balanced trees that, when viewed from the roots up (as explained in "Reverse the Spines", above), offer O(1) amortized insertion and deletion time. Individual operations can take Ω(lg n) time, but they put the tree in such a nice state that a large number of operations following the expensive operation will be cheap.

Unfortunately, this kind of analysis does not work when old versions of the tree are still around. A user can perform operations on the old, nearly-out-of-balance tree many times without any intervening cheap operations.

One way to get amortized bounds in a persistent setting was invented by Chris Okasaki. It is not simple to explain how the amortization survives the ability to use arbitrary old versions of a data structure, but if I remember correctly, Okasaki's first (as far as I know) paper on the subject has a pretty clear explanation. For more comprehensive explanations, see his thesis or his book.

As I understand it, there are two essential ingredients. First, instead of just guaranteeing that a certain number of cheap operations occur before each expensive operation (the usual approach to amortization), you actually designate and set up that specific expensive operation before performing the cheap operations that will pay for it. In some cases, the operation is scheduled to be started (and finished) only after many intervening cheap steps. In other cases, the operation is actually scheduled only O(1) steps in the future, but cheap operations may do part of the expensive operation and then reschedule more of it for later. An adversary looking to repeat an expensive operation over and over again is in fact reusing the same scheduled operation each time. This sharing is where the second ingredient comes in.

The computation is set up using laziness. A lazy value is not computed immediately, but, once computed, its result is saved. The first time a client needs to inspect a lazy value, it is computed. Later clients can use that cached value directly, without having to recompute it.

#include <stdlib.h>

struct lazy {
  int (*oper)(const char *);
  char * arg;
  int* ans;
};

typedef struct lazy * lazyop;

lazyop suspend(int (*oper)(const char *), char * arg) {
  lazyop ans = (lazyop)malloc(sizeof(struct lazy));
  ans->oper = oper;
  ans->arg = arg;
  ans->ans = 0;   /* not yet forced; force() checks this */
  return ans;
}

void force(lazyop susp) {
  if (0 == susp) return;
  if (0 != susp->ans) return;
  susp->ans = (int*)malloc(sizeof(int));
  *susp->ans = susp->oper(susp->arg);
}

int get(lazyop susp) {
  force(susp);
  return *susp->ans;
}
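As a quick check of this memoizing behavior, here is a condensed, self-contained variant of the suspension above; `square_len` is a made-up expensive operation, `get` folds in the forcing step, and `calls` counts how often the operation actually runs (all illustrative names, not from the original answer):

```c
#include <stdlib.h>

static int calls = 0;  /* counts how many times the "expensive" op runs */

static int square_len(const char *s) {
  int n = 0;
  calls++;
  while (s[n]) n++;
  return n * n;
}

struct lazy {
  int (*oper)(const char *);
  const char * arg;
  int * ans;
};

typedef struct lazy * lazyop;

lazyop suspend(int (*oper)(const char *), const char * arg) {
  lazyop s = (lazyop)malloc(sizeof(struct lazy));
  s->oper = oper;
  s->arg = arg;
  s->ans = 0;               /* not yet forced */
  return s;
}

int get(lazyop susp) {
  if (0 == susp->ans) {     /* force on first demand only */
    susp->ans = (int*)malloc(sizeof(int));
    *susp->ans = susp->oper(susp->arg);
  }
  return *susp->ans;
}
```

Demanding the value twice computes square_len only once; every later get is just a pointer read, which is the sharing the amortized analysis relies on.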

Laziness constructs are included in some MLs, and Haskell is lazy by default. Under the hood, laziness is a mutation, which leads some authors to call it a "side effect". That might be considered bad if that kind of side effect doesn't play well with whatever the reasons were for selecting an immutable data structure in the first place, but, on the other hand, thinking of laziness as a side effect allows the application of traditional amortized analysis techniques to persistent data structures, as mentioned in a paper by Kaplan, Okasaki, and Tarjan entitled "Simple Confluently Persistent Catenable Lists".

Consider again the adversary from above who is attempting to repeatedly force the computation of an expensive operation. After the first force of the lazy value, every remaining force is cheap.

In his book, Okasaki explains how to build deques with O(1) amortized time required for each operation. It is essentially a B+-tree, which is a tree where all of the elements are stored at the leaves, nodes may vary in how many children they have, and every leaf is at the same depth. Okasaki uses the spine-reversal method discussed above, and he suspends (that is, stores as a lazy value) the spines above the leaf elements.

A structure by Hinze and Paterson called "Finger trees: a simple general-purpose data structure" is halfway between the deques designed by Okasaki and the "Purely functional representations of catenable sorted lists" of Kaplan and Tarjan. Hinze and Paterson's structure has become very popular.

As evidence of how tricky the amortized analysis is to understand, Hinze and Paterson's finger trees are frequently implemented without laziness, making the time bounds not O(1) but still O(lg n). One implementation that seems to use laziness is the one in functional-dotnet. That project also includes an implementation of lazy values in C# which might help explain them if my explanation above is lacking.

Could deques be implemented as binary trees? Yes, and their worst-case complexity when used persistently would be no worse than those presented by Eric Lippert. However, Eric's trees are actually not complicated enough to get O(1) deque operations in a persistent setting, though only by a small complexity margin (making the center lazy) if you are willing to accept amortized performance. A different but also simple view of treaps can get O(1) expected performance in a functional setting, assuming an adversary who is not too tricky. Getting O(1) worst-case deque operations with a tree-like structure in a functional setting requires a good bit more complexity than Eric's implementations.

Two final notes (though this is a very interesting topic and I reserve the right to add more later) :-)

  1. Nearly all of the deques mentioned above are finger search trees as well. In a functional setting this means they can be split at the ith element in O(lg(min(i,n-i))) time and two trees of size n and m can be concatenated in O(lg(min(n,m))) time.

I know of two ways of implementing deques that don't use trees. Okasaki presents one in his book and thesis and the paper I linked to above. The other uses a technique called "global rebuilding" and is presented in Chuang and Goldberg's "Real-time deques, multihead Turing machines, and purely functional programming".
