合并排序究竟进行了多少次比较? [英] Exactly how many comparisons does merge sort make?

查看:89
本文介绍了合并排序究竟进行了多少次比较?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我已经读到,快速排序实际上比合并排序要快得多,原因是隐藏常量.

I have read that quicksort is much faster than mergesort in practice, and the reason for this is the hidden constant.

好吧,随机快速排序复杂度的解决方案是2nlnn = 1.39nlogn,这意味着quicksort中的常数是1.39.

Well, the solution for the randomized quick sort complexity is 2nlnn=1.39nlogn which means that the constant in quicksort is 1.39.

但是mergesort呢?mergesort中的常量是什么?

But what about mergesort? What is the constant in mergesort?

推荐答案

让我们看看我们是否可以解决这个问题!

Let's see if we can work this out!

在合并排序中,在递归的每个级别,我们都执行以下操作:

In merge sort, at each level of the recursion, we do the following:

  1. 将数组分成两半.
  2. 递归地对每一半进行排序.
  3. 使用合并算法将两个部分合并在一起.

那么每个步骤要进行多少次比较?好吧,除法步骤不做任何比较;它只是将数组分成两半.步骤2不(直接)进行任何比较;所有比较都是通过递归调用完成的.在第3步中,我们有两个大小为n/2的数组,需要将它们合并.这最多需要进行n次比较,因为合并算法的每个步骤都会进行一次比较,然后消耗一些数组元素,因此我们最多只能进行n次比较.

So how many comparisons are done at each step? Well, the divide step doesn't make any comparisons; it just splits the array in half. Step 2 doesn't (directly) make any comparisons; all comparisons are done by recursive calls. In step 3, we have two arrays of size n/2 and need to merge them. This requires at most n comparisons, since each step of the merge algorithm does a comparison and then consumes some array element, so we can't do more than n comparisons.

将其组合在一起,可以得到以下重复出现:

Combining this together, we get the following recurrence:

C(1) = 0
C(n) = 2C(n / 2) + n

(如评论中所述,线性项更精确(n-1),尽管这不会改变整体结论.我们将上述重复作为上限.)

(As mentioned in the comments, the linear term is more precisely (n - 1), though this doesn’t change the overall conclusion. We’ll use the above recurrence as an upper bound.)

为简化此过程,让我们定义n = 2 k 并用k重写此递归:

To simplify this, let's define n = 2k and rewrite this recurrence in terms of k:

C'(0) = 0
C'(k) = 2C'(k - 1) + 2^k

这里的前几项是0、2、8、24,....这看起来像k 2 k ,我们可以通过归纳证明这一点.作为我们的基本情况,当k = 0时,第一项为0,并且k 2 k 的值也为0.对于归纳步​​骤,假设要求对k成立,并考虑k +1.然后值为2(k 2 k )+ 2 k +1 = k 2 k +1 + 2 k+ 1 =(k +1)2 k +1 ,因此索赔要求适用于k +1,从而完成了归纳.因此,C'(k)的值为k 2 k .由于n = 2 k ,这意味着,假设n是2的理想幂,我们可以进行比较的次数是

The first few terms here are 0, 2, 8, 24, ... . This looks something like k 2k, and we can prove this by induction. As our base case, when k = 0, the first term is 0, and the value of k 2k is also 0. For the inductive step, assume the claim holds for some k and consider k + 1. Then the value is 2(k 2k) + 2k + 1 = k 2 k + 1 + 2k + 1 = (k + 1)2k + 1, so the claim holds for k + 1, completing the induction. Thus the value of C'(k) is k 2k. Since n = 2 k, this means that, assuming that n is a perfect power of two, we have that the number of comparisons made is

C(n)= n lg n

C(n) = n lg n

令人印象深刻的是,这比quicksort更好!那么,为什么地球上的快速排序比合并排序要快呢?这与与比较次数无关的其他因素有关.首先,由于快速排序在适当的地方工作,而合并排序在不适当的地方工作,因此在合并排序中引用的位置几乎不像在快速排序中那样好.这是一个巨大的因素,实际上,快速排序比合并排序要好得多,因为高速缓存未命中的代价非常大.此外,对数组进行排序所需的时间不仅仅考虑比较次数.其他因素(例如每个数组元素的移动次数)也很重要.例如,在合并排序中,我们需要为缓冲的元素分配空间,移动元素以便可以合并它们,然后合并回数组.这些举动在我们的分析中并未计算在内,但肯定会加起来.将此结果与quicksort的分区步骤进行比较,该步骤将每个数组元素仅移动一次并停留在原始数组内.这些额外因素(而不是进行比较的次数)决定了算法的运行时间.

Impressively, this is better than quicksort! So why on earth is quicksort faster than merge sort? This has to do with other factors that have nothing to do with the number of comparisons made. Primarily, since quicksort works in place while merge sort works out of place, the locality of reference is not nearly as good in merge sort as it is in quicksort. This is such a huge factor that quicksort ends up being much, much better than merge sort in practice, since the cost of a cache miss is pretty huge. Additionally, the time required to sort an array doesn't just take the number of comparisons into account. Other factors like the number of times each array element is moved can also be important. For example, in merge sort we need to allocate space for the buffered elements, move the elements so that they can be merged, then merge back into the array. These moves aren't counted in our analysis, but they definitely add up. Compare this to quicksort's partitioning step, which moves each array element exactly once and stays within the original array. These extra factors, not the number of comparisons made, dominate the algorithm's runtime.

此分析比最佳分析的精度稍差一些,但 Wikipedia 确认分析大约为n lg n,并且比起quicksort的平均情况,确实比比较少.

This analysis is a bit less precise than the optimal one, but Wikipedia confirms that the analysis is roughly n lg n and that this is indeed fewer comparisons than quicksort's average case.

希望这会有所帮助!

这篇关于合并排序究竟进行了多少次比较?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆