堆插入的 O(1) 平均情况复杂度的参数 [英] Argument for O(1) average-case complexity of heap insertion

查看:17
本文介绍了堆插入的 O(1) 平均情况复杂度的参数的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

Wikipedia page for binary heaps 声称插入是 O(logn) 在最坏的情况下,但平均为 O(1):

The claim on the Wikipedia page for binary heaps is that insertion is O(log n) in the worst case, but O(1) on average:

所需的操作数量仅取决于新元素必须上升的层数以满足堆属性,因此插入操作的最坏情况时间复杂度为 O(log n)但平均情况复杂度为 O(1).

The number of operations required depends only on the number of levels the new element must rise to satisfy the heap property, thus the insertion operation has a worst-case time complexity of O(log n) but an average-case complexity of O(1).

链接页面试图证明这一点如下:

但是,平均而言,新插入的元素不会沿树向上移动很远.特别是,假设密钥分布均匀,它有二分之一的机会大于其父项;鉴于它大于其父母,它有二分之一的机会大于其祖父母;鉴于它大于其父代,它有二分之一的机会大于其曾祖父母,依此类推[...],因此在平均情况下,插入需要恒定的时间

However, on average, the newly inserted element does not travel very far up the tree. In particular, assuming a uniform distribution of keys, it has a one-half chance of being greater than its parent; it has a one-half chance of being greater than its grandparent given that it is greater than its parent; it has a one-half chance of being greater than its great-grandparent given that it is greater than its parent, and so on [...] so that in the average case insertion takes constant time

这肯定是胡说八道吧?在我看来,如果树是随机排序的,那么新元素大于其父元素的可能性为 50/50;但是因为,粗略地说,大元素沉到底部,随着堆的增长,机会远小于 50/50.

This is surely nonsense, though? It seems to me that if the tree were randomly ordered then there would be a 50/50 chance that a new element was greater than its parent; but that since, roughly speaking, the large elements sink to the bottom, the chances are much less than 50/50 as the heap grows.

是吗?

在维基百科上已经这样好几个月了......

It's been like that on Wikipedia for several months...

推荐答案

对于堆插入平均时间为 O(1) 的说法,有一个更好的参考:1991 年的论文堆构建的平均案例分析通过重复插入" by Hayward &麦克迪亚米德.(这篇论文链接在维基百科文章的当前参考文献 4 中.)该论文又引用了 Porter & 1975 年的一篇论文.Simon,随机插入优先级队列结构",处理单个插入到堆中,并证明平均情况是 O(1).

There is a much better reference for the claim that average time heap insertion is O(1): the 1991 paper "Average Case Analysis of Heap Building by Repeated Insertion" by Hayward & McDiarmid. (This paper is linked in what is currently reference 4 of the Wikipedia article.) That paper in turn references a 1975 paper by Porter & Simon, "Random insertion into a priority queue structure" which deals with a single insertion into a heap, and demonstrates that the average case is O(1).

直观地说,这个论点很简单.堆的一半是叶子,叶子往往更大.如果我们暂时假设叶子是堆中最大的元素(而不是趋于更大),那么我们可以说新元素是叶子的概率——即它在上半部分值的范围——正好是 0.5.如果新元素不是堆的叶子(概率也是 0.5),我们可以用截断的堆重复这个过程,该堆只包含原始堆中的非叶子节点,所以新元素在第二个的概率-最低水平将是剩余水平的一半:0.25.因此,它处于第三级的概率为 0.125,依此类推.那么我们必须搜索的预期级别数将是 1*0.5 + 2*0.25 + 3*0.125...,即 2.

Intuitively, the argument is simple. Half of the heap is a leaf, and the leaves tend to be larger. If we assume for a moment that leaves are the largest elements in the heap (rather than tending to be larger), then we could say that the probability that a new element will be a leaf -- i.e., that it is in the upper half of the range of values -- would be exactly 0.5. If the new element is not a leaf of the heap (also probability 0.5), we can repeat the process with the truncated heap consisting only of non-leaf nodes in the original heap, so the probability that the new element is at the second-lowest level will be half of what remains: 0.25. And thus the probability that it is at the third level would be 0.125, and so on. Then the expected number of levels we would have to search through would be 1*0.5 + 2*0.25 + 3*0.125..., which is 2.

当然,随机新元素大于随机二级父元素的概率并不是真正的0.5;它实际上少了一点.但是,只要它以常数为界,计算预期比较次数的幂级数之和也将受到常数的限制.事实证明,常数在 2.6 左右.

Of course, the probability that the random new element is greater than a random second-level parent is not really 0.5; it's actually a little less. But, as long as it is bounded by a constant, the sum of the power series which computes the expected number of comparisons will still also be bounded by a constant. And it turns out that the constant is around 2.6.

另请参阅这个有用的答案,在讨论堆的复杂性并将它们与 BST 的复杂性进行对比时,给出了一个堆中恒定平均插入时间的详细图形分析.

Also see this useful answer that, while discussing complexities of heaps, and contrasting them with those of BST, gives a detailed graphical analysis of the constant average insertion time in heaps.

这篇关于堆插入的 O(1) 平均情况复杂度的参数的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆