如何至K排序的阵列,与归并排序排序 [英] How to sort K sorted arrays, with MERGE SORT

查看:120
本文介绍了如何至K排序的阵列,与归并排序排序的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我知道这个问题已经被问过了,还有用最小堆一个非常漂亮优雅的解决方案。

I know that this question has been asked, and there is a very nice elegant solution using a min heap.

我的问题是如何将使用合并排序的合并功能的一个做到这一点。

MY question is how would one do this using the merge function of merge sort.

您已经有排序数组的数组。所以,你应该能够合并所有的人到一个数组中的O(n日志K)的时间,正确的?

You already have an array of sorted arrays. So you should be able to merge all of them into one array in O(nlog K) time, correct?

我只是不知道如何做到这一点!

I just can't figure out how to do this!

说我有

[[5,6],[3,4],[1,2],[0]]

[ [5,6], [3,4], [1,2], [0] ]

步骤1:[3,4,5,6],[0,1,2]]

Step 1: [ [3,4,5,6], [0,1,2] ]

第二步:[0,1,2,3,4,5,6]]

Step2: [ [0,1,2,3,4,5,6] ]

有没有一种简单的方法来做到这一点?为O(n日志K)理论上可达到与归并排序?

Is there a simple way to do this? Is O(nlog K) theoretically achievable with mergesort?

推荐答案

正如其他人所说,使用最小堆举行下项目的最佳方式。这就是所谓的N路合并。它的复杂度为O(N日志K)。

As others have said, using the min heap to hold the next items is the optimal way. It's called an N-way merge. Its complexity is O(n log k).

您的可以的使用2路合并算法至K数组排序。也许最简单的方法就是修改标准合并排序,以便它使用非恒定的分区大小。例如,假设你有4阵列,长度10,8,12,和33。每个数组排序。如果您串联的数组合并为一个,你会这些分区(数字是索引到数组,而不是值):

You can use a 2-way merge algorithm to sort k arrays. Perhaps the easiest way is to modify the standard merge sort so that it uses non-constant partition sizes. For example, imagine that you have 4 arrays with lengths 10, 8, 12, and 33. Each array is sorted. If you concatenated the arrays into one, you would have these partitions (the numbers are indexes into the array, not values):

[0-9][10-17][18-29][30-62]

您合并排序的第一阶段将有0和10开始索引你可能会合并成一个新的数组,就像你用标准的合并排序。下传将开始在所述第二阵列中的位置18和30。当你用第二遍做,你的输出数组包含:

The first pass of your merge sort would have starting indexes of 0 and 10. You would merge that into a new array, just as you would with the standard merge sort. The next pass would start at positions 18 and 30 in the second array. When you're done with the second pass, your output array contains:

[0-17][18-62]

现在你的分区从0开始,18你合并这两个成一个数组,你就大功告成了。

Now your partitions start at 0 and 18. You merge those two into a single array and you're done.

唯一真正的区别是,而不是开始为2的分区大小和加倍,你有非恒定的分区大小。当你让每一个传球,新的分区大小是你在previous通所用的两个分区的大小的总和。这真的是标准的合并排序只是稍作修改。

The only real difference is that rather than starting with a partition size of 2 and doubling, you have non-constant partition sizes. As you make each pass, the new partition size is the sum of the sizes of the two partitions you used in the previous pass. This really is just a slight modification of the standard merge sort.

这将需要登录(k)的传递做了排序,并在每个传递,你看看所有n项。该算法是O(n日志k)的,但具有比N路合并高得多的恒定

It will take log(k) passes to do the sort, and at each pass you look at all n items. The algorithm is O(n log k), but with a much higher constant than the N-way merge.

有关实施,建立一个整数数组,其中包含每个你的子阵列的开始索引。因此,在上面的例子中,你将有:

For implementation, build an array of integers that contains the starting indexes of each of your sub arrays. So in the example above you would have:

int[] partitions = [0, 10, 18, 30];
int numPartitions = 4;

现在,你做你的标准归并排序。但你从分区数组中选择你的分区。所以,你的合并将开始与:

Now you do your standard merge sort. But you select your partitions from the partitions array. So your merge would start with:

merge (inputArray, outputArray, part1Index, part2Index, outputStart)
{
    part1Start = partitions[part1Index];
    part2Start = partitions[part2Index];

    part1Length = part2Start - part1Start;
    part2Length = partitions[part2Index-1] - part2Start;

    // now merge part1 and part2 into the output array,
    // starting at outputStart
}

和你的主循环看起来是这样的:

And your main loop would look something like:

while (numPartitions > 1)
{
    for (int p = 0; p < numPartitions; p += 2)
    {
        outputStart = partitions[p];
        merge(inputArray, outputArray, p, p+1, outputStart);
        // update partitions table
        partitions[p/2] = partitions[p] + partitions[p+1];
    }
    numPartitions /= 2;
}

这是基本的思路。你必须做一些工作来处理晃来晃去的分区时数为奇数,但总的来说这是它是如何做的。

That's the basic idea. You'll have to do some work to handle the dangling partition when the number is odd, but in general that's how it's done.

您也可以通过保持数组的数组,每个两个阵列合并成一个新的数组,加入做,要阵列的输出阵列。车床,漂洗,重复。

You can also do it by maintaining an array of arrays, and merging each two arrays into a new array, adding that to an output array of arrays. Lather, rinse, repeat.

这篇关于如何至K排序的阵列,与归并排序排序的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆