Why is Merge sort better for large arrays and Quick sort for small ones?

Question

The only reason I see for using merge sort over quick sort is if the list was already (or mostly) sorted.

Merge sort requires more space as it creates an extra array for storage, and no matter what it will compare every item.

Quick sort, on the other hand, does not require extra space and doesn't swap or compare more than necessary.

It seems unintuitive to say that one is better than the other because the data set is large or small.

For example, quoting the GeeksforGeeks article on the topic:

Merge sort can work well on any type of data sets irrespective of its size (either large or small). whereas The quick sort cannot work well with large datasets.

It goes on to say:

Merge sort is not in place because it requires additional memory space to store the auxiliary arrays. whereas The quick sort is in place as it doesn't require any additional storage.

I understand that space complexity and time complexity are separate things. But it is still an extra step, and of course writing everything to a new array takes more time with large data sets.

As for the pivoting problem, the bigger the data set, the lower the chance of picking the lowest or highest item (unless, again, it's an almost sorted list).

So why is merge sort considered a better way of sorting large data sets than quick sort?

Answer

Why is Merge sort better for large arrays and Quick sort for small ones? It seems unintuitive to say that one is better than the other because the data set is large or small.

Assuming the dataset fits in memory (not paged out), the issue is not the size of the dataset, but a worst-case pattern for a particular implementation of quick sort that results in O(n²) time complexity. Quick sort can use median of medians to guarantee a worst-case time complexity of O(n log(n)), but that ends up making it significantly slower than merge sort. An alternative is to switch to heap sort if the recursion becomes too deep; this hybrid, known as introsort, is used in some libraries.

https://en.wikipedia.org/wiki/Median_of_medians

https://en.wikipedia.org/wiki/Introsort
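
To make the introsort idea concrete, here is a minimal Python sketch (the function names and the simple Lomuto partition are assumptions for illustration, not any library's actual code): quicksort as usual, but fall back to heapsort on a sub-array once the recursion depth passes roughly 2·log2(n), which keeps the worst case at O(n log(n)).

```python
import heapq
from math import log2

def intro_sort(a):
    if a:
        _intro(a, 0, len(a) - 1, 2 * int(log2(len(a))) + 1)

def _intro(a, lo, hi, depth):
    if lo >= hi:
        return
    if depth == 0:
        # Quicksort has recursed too deep: heapsort this slice instead,
        # which bounds the worst case at O(n log(n)).
        heap = a[lo:hi + 1]
        heapq.heapify(heap)
        for k in range(lo, hi + 1):
            a[k] = heapq.heappop(heap)
        return
    # Simple Lomuto partition using the middle element as the pivot.
    mid = (lo + hi) // 2
    a[mid], a[hi] = a[hi], a[mid]
    pivot, i = a[hi], lo
    for j in range(lo, hi):
        if a[j] <= pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    _intro(a, lo, i - 1, depth - 1)
    _intro(a, i + 1, hi, depth - 1)

data = [5, 1, 4, 1, 5, 9, 2, 6, 5, 3]
intro_sort(data)
print(data)  # [1, 1, 2, 3, 4, 5, 5, 5, 6, 9]
```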

Merge sort requires more space as it creates an extra array for storage, and no matter what it will compare every item.

There are variations of merge sort that don't require any extra storage for data, but they tend to be 50% or more slower than standard merge sort.
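
For reference, a minimal sketch of standard top-down merge sort (the function name is assumed for illustration), showing where the auxiliary storage comes from: the merge step writes into a newly allocated list.

```python
def merge_sort(a):
    """Standard top-down merge sort; the merge step allocates a new list."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])
    right = merge_sort(a[mid:])
    merged = []                 # the auxiliary storage
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

print(merge_sort([38, 27, 43, 3, 9, 82, 10]))  # [3, 9, 10, 27, 38, 43, 82]
```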

Quick sort, on the other hand, does not require extra space and doesn't swap or compare more than necessary.

Every element of a sub-array is compared to the pivot element. As the number of equal elements increases, the Lomuto partition scheme gets worse, while the Hoare partition scheme gets better. With a lot of equal elements, the Hoare partition scheme needlessly swaps equal elements, but the check needed to avoid those swaps generally costs more time than just swapping.
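
A small sketch of the two partition schemes (function names assumed for illustration) makes the equal-elements behavior concrete: on an array of identical keys, Lomuto yields a maximally unbalanced split, while Hoare swaps equal elements but splits near the middle.

```python
def lomuto_partition(a, lo, hi):
    # Classic Lomuto scheme: pivot is the last element.
    pivot = a[hi]
    i = lo
    for j in range(lo, hi):
        if a[j] <= pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    return i                          # final position of the pivot

def hoare_partition(a, lo, hi):
    # Classic Hoare scheme: indices move toward each other and swap.
    pivot = a[(lo + hi) // 2]
    i, j = lo - 1, hi + 1
    while True:
        i += 1
        while a[i] < pivot:
            i += 1
        j -= 1
        while a[j] > pivot:
            j -= 1
        if i >= j:
            return j                  # end of the left partition
        a[i], a[j] = a[j], a[i]       # swaps even when a[i] == a[j] == pivot

n = 16
print(lomuto_partition([7] * n, 0, n - 1))  # 15 -> sub-arrays of sizes 15 and 0
print(hoare_partition([7] * n, 0, n - 1))   # 7  -> sub-arrays of sizes 8 and 8
```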

sorting an array of pointers to objects

Merge sort does more moves but fewer compares than quick sort. If sorting an array of pointers to objects, only the pointers are moved, but comparing objects requires dereferencing the pointers and doing whatever is needed to compare the objects. In this case, and in other cases where a compare takes more time than a move, merge sort is faster.
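
A self-contained sketch of that trade-off (the counter and function names are assumptions for illustration): both sorts go through a compare function that stands in for an expensive comparison such as dereferencing pointers to objects, and the comparison counts are printed. On random input, the merge sort count is typically noticeably lower.

```python
import random

compares = 0

def less(x, y):
    """Stand-in for an expensive compare, e.g. dereferencing object pointers."""
    global compares
    compares += 1
    return x < y

def merge_sort(a):
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if less(right[j], left[i]):   # one compare per merged element
            out.append(right[j])
            j += 1
        else:
            out.append(left[i])
            i += 1
    return out + left[i:] + right[j:]

def quick_sort(a, lo=0, hi=None):
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    # Lomuto partition with a random pivot; one compare per element.
    r = random.randint(lo, hi)
    a[r], a[hi] = a[hi], a[r]
    pivot, i = a[hi], lo
    for j in range(lo, hi):
        if not less(pivot, a[j]):     # a[j] <= pivot
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    quick_sort(a, lo, i - 1)
    quick_sort(a, i + 1, hi)

random.seed(1)
keys = [random.random() for _ in range(10_000)]

compares = 0
merge_sort(list(keys))
print("merge sort compares:", compares)

compares = 0
quick_sort(list(keys))
print("quick sort compares:", compares)
# Merge sort typically reports fewer comparisons on random input.
```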

large datasets that don't fit in memory

For datasets too large to fit in memory, an in-memory sort is used to sort "chunks" of the dataset that do fit in memory, and each sorted chunk is written to external storage. The "chunks" on external storage are then merged using a k-way merge to produce a sorted dataset.

https://en.wikipedia.org/wiki/External_sorting
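
A rough sketch of that process under some assumptions (line-oriented text records, made-up file paths, and heapq.merge for the k-way merge): sort memory-sized chunks in memory, write each to a temporary file, then merge the sorted chunk files.

```python
import heapq
import os
import tempfile

def external_sort(input_path, output_path, chunk_hint=100_000_000):
    # Phase 1: read roughly chunk_hint bytes of lines at a time, sort each
    # chunk in memory, and write it to its own temporary file.
    chunk_paths = []
    with open(input_path) as src:
        while True:
            lines = src.readlines(chunk_hint)
            if not lines:
                break
            lines.sort()                      # records compared as text lines
            fd, path = tempfile.mkstemp(text=True)
            with os.fdopen(fd, "w") as tmp:
                tmp.writelines(lines)
            chunk_paths.append(path)

    # Phase 2: k-way merge of the sorted chunk files into the output file.
    chunks = [open(p) for p in chunk_paths]
    try:
        with open(output_path, "w") as out:
            out.writelines(heapq.merge(*chunks))
    finally:
        for f in chunks:
            f.close()
        for p in chunk_paths:
            os.remove(p)

# Usage (hypothetical file names):
# external_sort("unsorted.txt", "sorted.txt")
```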
