如何实现中值堆 [英] How to implement a Median-heap

查看:14
本文介绍了如何实现中值堆的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

像最大堆和最小堆一样,我想实现一个中值堆来跟踪给定整数集的中值.API应具有以下三个功能:

Like a Max-heap and Min-heap, I want to implement a Median-heap to keep track of the median of a given set of integers. The API should have the following three functions:

insert(int)  // should take O(logN)
int median() // will be the topmost element of the heap. O(1)
int delmedian() // should take O(logN)

我想使用数组 (a) 实现来实现堆,其中数组索引 k 的子项存储在数组索引 2*k 和 2*k + 1 中.为方便起见,数组从索引 1 开始填充元素.这是我到目前为止:中值堆将有两个整数来跟踪到目前为止插入的整数的数量,它们是 > 当前中值 (gcm) 和 <;当前中值 (lcm).

I want to use an array (a) implementation to implement the heap where the children of array index k are stored in array indices 2*k and 2*k + 1. For convenience, the array starts populating elements from index 1. This is what I have so far: The Median-heap will have two integers to keep track of number of integers inserted so far that are > current median (gcm) and < current median (lcm).

if abs(gcm-lcm) >= 2 and gcm > lcm we need to swap a[1] with one of its children. 
The child chosen should be greater than a[1]. If both are greater, 
choose the smaller of two.

另一种情况类似.我想不出如何下沉和游泳元素的算法.我认为应该考虑数字与中位数的接近程度,例如:

Similarly for the other case. I can't come up with an algorithm for how to sink and swim elements. I think it should take into consideration how close the number is to the median, so something like:

private void swim(int k) {
    while (k > 1 && absless(k, k/2)) {   
        exch(k, k/2);
        k = k/2;
    }
}

虽然我想不出完整的解决方案.

I can't come up with the entire solution though.

推荐答案

您需要两个堆:一个最小堆和一个最大堆.每个堆包含大约一半的数据.最小堆中的每个元素都大于或等于中位数,而最大堆中的每个元素都小于或等于中位数.

You need two heaps: one min-heap and one max-heap. Each heap contains about one half of the data. Every element in the min-heap is greater or equal to the median, and every element in the max-heap is less or equal to the median.

当最小堆比最大堆多一个元素时,中位数在最小堆的顶部.当最大堆比最小堆多一个元素时,中位数在最大堆的顶部.

When the min-heap contains one more element than the max-heap, the median is in the top of the min-heap. And when the max-heap contains one more element than the min-heap, the median is in the top of the max-heap.

当两个堆包含相同数量的元素时,元素总数是偶数.在这种情况下,您必须根据您对中位数的定义进行选择: a) 两个中间元素的平均值;b) 两者中较大者;c) 较小者;d) 随机选择两者中的任何一个...

When both heaps contain the same number of elements, the total number of elements is even. In this case you have to choose according your definition of median: a) the mean of the two middle elements; b) the greater of the two; c) the lesser; d) choose at random any of the two...

每次插入时,将新元素与堆顶部的元素进行比较,以决定插入的位置.如果新元素大于当前中值,则进入最小堆.如果小于当前中位数,则进入最大堆.那么你可能需要重新平衡.如果堆的大小相差不止一个元素,则从具有更多元素的堆中提取最小值/最大值并将其插入另一个堆.

Every time you insert, compare the new element with those at the top of the heaps in order to decide where to insert it. If the new element is greater than the current median, it goes to the min-heap. If it is less than the current median, it goes to the max heap. Then you might need to rebalance. If the sizes of the heaps differ by more than one element, extract the min/max from the heap with more elements and insert it into the other heap.

为了为元素列表构造中值堆,我们应该首先使用线性时间算法并找到中值.一旦知道中位数,我们就可以根据中位数简单地将元素添加到最小堆和最大堆中.不需要平衡堆,因为中位数会将输入的元素列表分成相等的两半.

In order to construct the median heap for a list of elements, we should first use a linear time algorithm and find the median. Once the median is known, we can simply add elements to the Min-heap and Max-heap based on the median value. Balancing the heaps isn't required because the median will split the input list of elements into equal halves.

如果您提取一个元素,您可能需要通过将一个元素从一个堆移动到另一个来补偿大小变化.通过这种方式,您可以始终确保两个堆的大小相同或仅相差一个元素.

If you extract an element you might need to compensate the size change by moving one element from one heap to another. This way you ensure that, at all times, both heaps have the same size or differ by just one element.

这篇关于如何实现中值堆的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆