Finding last element of a binary heap

Problem description

Quoting Wikipedia:

> It is perfectly acceptable to use a traditional binary tree data structure to implement a binary heap. There is an issue with finding the adjacent element on the last level on the binary heap when adding an element which can be resolved algorithmically...

Any ideas on how such an algorithm might work?

I was not able to find any information about this issue, since most binary heaps are implemented using arrays.

Any help appreciated.


I recently registered an OpenID account and am not able to edit my initial post or comment on answers. That's why I am responding via this answer. Sorry for this.


Quoting Mitch Wheat:

> @Yse: is your question "How do I find the last element of a binary heap"?

Yes, it is. Or to be more precise, my question is: "How do I find the last element of a non-array-based binary heap?".

Quoting Suppressingfire:

> Is there some context in which you're asking this question? (i.e., is there some concrete problem you're trying to solve?)

As stated above, I would like to know a good way to "find the last element of a non-array-based binary heap" which is necessary for insertion and deletion of nodes.

Quoting Roy:

> It seems most understandable to me to just use a normal binary tree structure (using a pRoot and Node defined as [data, pLeftChild, pRightChild]) and add two additional pointers (pInsertionNode and pLastNode). pInsertionNode and pLastNode will both be updated during the insertion and deletion subroutines to keep them current when the data within the structure changes. This gives O(1) access to both insertion point and last node of the structure.

Yes, this should work. If I am not mistaken, it could be a little tricky to find the insertion node and the last node when their locations move to another subtree due to a deletion or insertion. But I'll give this a try.

Quoting Zach Scrivena:

> How about performing a depth-first search...

Yes, this would be a good approach. I'll try that out, too.

Still, I am wondering whether there is a way to "calculate" the locations of the last node and the insertion point. The height of a binary heap with N nodes can be calculated by taking the log (base 2) of the smallest power of two that is larger than N. Perhaps it is possible to calculate the number of nodes on the deepest level, too. Then it might be possible to determine how the heap has to be traversed to reach the insertion point or the node for deletion.
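The shape arithmetic described above can be sketched in C++ roughly as follows. This is only an illustration under a few assumptions: `HeapShape` and `shapeOf` are names made up for this example, nodes are numbered in level order starting at 1, and the root counts as level 1 (which is what the log-of-the-next-power-of-two formula yields).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type: the shape of a complete binary tree with n nodes.
struct HeapShape {
    std::size_t levels;        // number of levels, counting the root as level 1
    std::size_t onLastLevel;   // nodes actually present on the deepest level
    std::size_t lastLevelCap;  // nodes the deepest level could hold when full
};

// Computes the shape from the node count alone (n must be at least 1).
HeapShape shapeOf(std::size_t n) {
    assert(n >= 1);
    std::size_t levels = 1;
    // firstOnLast becomes the largest power of two <= n, which is the
    // 1-based level-order index of the leftmost node on the deepest level.
    std::size_t firstOnLast = 1;
    while (firstOnLast <= n / 2) {
        firstOnLast *= 2;
        ++levels;
    }
    return HeapShape{levels, n - firstOnLast + 1, firstOnLast};
}
```

For example, a heap with 10 nodes has 4 levels, and 3 of the 8 possible slots on the deepest level are occupied, so the next insertion goes into the fourth slot of that level.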

Solution

Basically, the quoted statement refers to the problem of locating where data elements are inserted into and deleted from the heap. In order to maintain "the shape property" of a binary heap, the lowest level of the heap must always be filled from left to right, leaving no empty nodes. To keep insertion at O(1) on average (and to avoid adding overhead to deletion, which needs O(log n) for the bubble-down anyway), you must be able to determine, in constant time, both the location for the next insertion and the location of the last node on the lowest level, which is used to replace the root on deletion.

For a binary heap stored in an array (with its implicit, compacted data structure as explained in the Wikipedia entry), this is easy. To insert, append the newest data member at the end of the array and then "bubble" it up into position (following the heap rules). To delete, replace the root with the last element in the array and "bubble" it down. For heaps in array storage, the number of elements in the heap is an implicit pointer to where the next data element is to be inserted and where to find the last element to use for deletion.
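As a point of comparison, here is a minimal sketch of the array case. It assumes a min-heap of `int` values (the answer does not say which ordering is used), and `ArrayMinHeap`, `push`, and `popMin` are illustrative names only. Note how the element count alone locates both the insertion point and the last element.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Illustrative array-backed min-heap; not taken from the original answer.
class ArrayMinHeap {
public:
    void push(int value) {
        items_.push_back(value);                 // insertion point = end of the array
        std::size_t i = items_.size() - 1;
        while (i > 0) {                          // bubble up while smaller than the parent
            std::size_t parent = (i - 1) / 2;
            if (items_[parent] <= items_[i]) break;
            std::swap(items_[parent], items_[i]);
            i = parent;
        }
    }

    int popMin() {
        assert(!items_.empty());
        int top = items_.front();
        items_.front() = items_.back();          // last element = end of the array
        items_.pop_back();
        std::size_t i = 0;
        const std::size_t n = items_.size();
        while (true) {                           // bubble down toward the smaller child
            std::size_t left = 2 * i + 1, right = left + 1, smallest = i;
            if (left < n && items_[left] < items_[smallest]) smallest = left;
            if (right < n && items_[right] < items_[smallest]) smallest = right;
            if (smallest == i) break;
            std::swap(items_[i], items_[smallest]);
            i = smallest;
        }
        return top;
    }

    std::size_t size() const { return items_.size(); }

private:
    std::vector<int> items_;
};
```

Pushing 5, 3, 8, 1, 4 and then popping repeatedly yields the values in ascending order.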

For a binary heap stored in a tree structure, this information is not as obvious, but because the heap is a complete binary tree, it can be calculated. For example, in a complete binary tree with 4 elements, the point of insertion will always be the right child of the left child of the root node. The node to use for deletion will always be the left child of the left child of the root node. For any given tree size, the tree will always have a specific shape with well-defined insertion and deletion points, so the structural location of the insertion/deletion point can be computed directly from the element count.

However, the catch is that even when you know where the node is structurally, you have no idea where it is in memory. So you have to traverse the tree to get to it, which is an O(log n) process. That makes every insertion at least O(log n) and breaks the average O(1) insertion behavior that is usually desired. Any search ("depth-first" or otherwise) will be at least O(log n) as well because of the traversal issue, and usually O(n), because the heap is only partially ordered and gives a search no guidance toward the last node.
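One common way to do that calculation, sketched here under stated assumptions rather than taken from the answer: number the nodes in level order starting at 1, and read the binary digits of the target position after the leading 1 from left to right, where 0 means "go left" and 1 means "go right". `nodeAt` and `appendNode` are hypothetical helper names, the `Node` layout follows the [data, pParent, pLeft, pRight] description, and nodes are never freed in this sketch.

```cpp
#include <cassert>
#include <cstddef>

struct Node {
    int data;
    Node* pParent;
    Node* pLeft;
    Node* pRight;
    explicit Node(int d) : data(d), pParent(nullptr), pLeft(nullptr), pRight(nullptr) {}
};

// Returns the node at 1-based level-order position pos, or nullptr if absent.
// The walk costs O(log pos) pointer hops.
Node* nodeAt(Node* pRoot, std::size_t pos) {
    assert(pos >= 1);
    std::size_t bit = 1;
    while (bit <= pos / 2) bit *= 2;             // bit = highest set bit of pos
    Node* cur = pRoot;
    for (bit /= 2; bit > 0 && cur != nullptr; bit /= 2) {
        cur = (pos & bit) ? cur->pRight : cur->pLeft;
    }
    return cur;
}

// Attaches a new node at position count + 1, where count is the current size.
// Only the shape is maintained here; no bubbling of data is performed.
Node* appendNode(Node*& pRoot, std::size_t count, int data) {
    Node* p = new Node(data);
    const std::size_t pos = count + 1;
    if (pos == 1) {
        pRoot = p;
        return p;
    }
    Node* parent = nodeAt(pRoot, pos / 2);       // parent of position pos
    p->pParent = parent;
    if (pos % 2 == 0) parent->pLeft = p;         // even positions are left children
    else parent->pRight = p;                     // odd positions are right children
    return p;
}
```

With 4 nodes in the tree, position 4 (binary 100) is reached by going left twice, and position 5 (binary 101) by going left and then right, which matches the 4-element example in the text.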

The trick is to be able to both calculate and reference those insertion/deletion points in constant time, either by augmenting the data structure ("threading" the tree, as mentioned in the Wikipedia article) or by using additional pointers.

The implementation that seems to me the easiest to understand, with low memory and coding overhead, is to just use a normal simple binary tree structure (using a pRoot and Node defined as [data, pParent, pLeftChild, pRightChild]) and add two additional pointers (pInsert and pLastNode). pInsert and pLastNode are both updated during the insertion and deletion subroutines to keep them current when the data within the structure changes. This implementation gives O(1) access to both the insertion point and the last node of the structure, so it should preserve average O(1) insertion while adding no asymptotic overhead to deletion. The cost of the implementation is two extra pointers and some minor extra code in the insertion/deletion subroutines (i.e., minimal).

EDIT: added pseudocode for an O(1) insert()

Here is pseudocode for an insert subroutine that is O(1) on average:

define Node = [T data, *pParent, *pLeft, *pRight]

void insert(T data)
{
    do_insertion( data );   // do insertion, update count of data items in tree

    // assume: pInsert still points at the node position where the insertion just took place
    //   (i.e., either shuffle only data during the insertion or keep pInsert updated during the bubble process)

    int N = this->CountOfDataItems + 1;     // 1-based level-order position of the NEXT insertion
                                            // note: CountOfDataItems is always > 0 (and pRoot != null) after an insertion

    p = new Node( <null>, null, null, null );   // new empty node for the next insertion

    // update pInsert (three cases to handle)
    if ( (N & (N - 1)) == 0 )
    {
        // #1 - N is an exact power of two: O(log2(N))
        // tree is currently a perfect binary tree, so a new lower level must be started
        // traverse from pRoot down through each pLeft until an empty pLeft is found
        pInsert = pRoot;
        while (pInsert->pLeft != null) { pInsert = pInsert->pLeft; }    // log2(N) - 1 iterations
        p->pParent = pInsert;
        pInsert->pLeft = p;
    }
    else if ( !isEven(N) )
    {
        // #2 - N is odd: O(1)
        // the node just filled is a left child; the new node becomes its right sibling
        p->pParent = pInsert->pParent;
        pInsert->pParent->pRight = p;
    }
    else
    {
        // #3 - N is even (and NOT a power of 2): O(1) on average
        // the node just filled is a right child; climb until we arrive from a left child,
        // step over to the right sibling, then descend along pLeft to the parent of position N
        q = pInsert;
        while (q == q->pParent->pRight) { q = q->pParent; }
        q = q->pParent->pRight;
        while (q->pLeft != null) { q = q->pLeft; }
        p->pParent = q;
        q->pLeft = p;
    }
    pInsert = p;

    // update pLastNode
    // ... [similar process]
}

So, insert(T) is O(1) on average: updating the insertion pointer is cheap in most cases, and the O(log N) walk needed when the tree must grow by one level happens only when N reaches a power of two, i.e. about log N times over N insertions (assuming no deletions). The addition of another pointer (pLeftmostLeaf) could make that case O(1) as well and avoids the possible pathological case of alternating insertions and deletions in a perfect binary tree. (Adding pLeftmostLeaf is left as an exercise [it's fairly easy].)
