如何优化快速排序 [英] How to optimize quicksort

查看:90
本文介绍了如何优化快速排序的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试制定一种有效的快速排序算法。它可以正常工作,但是当元素数量巨大且数组的某些部分已预先排序时,运行时间会很长。我在 quicksort 上查找Wikipedia文章,并在其中找到了这样的文字:


要确保最多使用O(log N)空间,请先递归到数组的较小一半,然后使用tail调用递归到另一个数组。



对于这样的小数组(即,其长度小于通过实验确定的阈值t),使用具有较小常数因子并因此在小数组上更快的插入排序。这可以通过不对此类数组进行排序并在末尾运行一次插入排序遍历来实现,因为插入排序可有效处理几乎排序的数组。识别每个小片段时,单独的插入排序会增加启动和停止许多小排序的开销,但避免了浪费精力跨多个片段边界比较键,由于快速排序过程的工作,这些键是有序的。


我目前正在为两个分区递归。任何想法如何实施第一个技巧? 先递归到数组的较小部分,然后使用尾部调用递归到另一个是什么意思?其次,如何在quicksort中实现 inserttion-sort ?它会始终提高效率,还是仅在对数组的某些部分进行预排序时才提高效率?如果是第二种情况,那么我当然不知道何时会发生。所以什么时候应该在 inserttion-sort 中添加?

解决方案

-sort,您选择了一个随机的枢轴,该枢轴将数组定界为两个数组的一半,大多数情况下可能会更小,



例如数组大小为100,枢轴将数组定界为40/60,较小的数组为40。



让我们假设您决定阈值大小来使用插入排序为10,
,您需要继续通过数据透视表递归拆分数组,每当一半的其中一个变得小于或等于10时,您可以在小尺寸数组上使用类似于O(n)的插入排序。 / p>

请注意,如果对数组进行反向排序(最坏的情况),插入排序将表现不佳。



关于递归方面,您只需要修改快速排序递归的停止条件->数组大小<== 10停止递归,并使用插入排序对所有数组(在此递归步骤中要小得多)进行排序。



通过尾递归,它们的意思是在前半部分执行您需要的所有操作,然后将较小的一半插入插入排序作为最后一个方法,用于节省空间。

 快速排序()
选择一个枢轴
将较小的元素从左移
将较大的元素从右移
快速排序在数组较大的一半上

如果一半小于X
然后在另一半上执行插入排序<-这是尾递归插入排序
否则在另一半上
快速排序也是

据我所见,第二个优化建议不要对每个递归步骤都使用插入排序,但要记住对其进行约束的索引,然后在一批中调用插入排序以将所有切片中的项目连接在一起,这样可以确保提高缓存的使用率,但是实现起来稍微困难一些,


I am trying to work out an efficient quicksort algo. It works okay, but takes long time to run when the number of elements are huge, and certain sections of the array are pre-sorted. I was looking up the Wikipedia article on quicksort, and there I found this written:

To make sure at most O(log N) space is used, recurse first into the smaller half of the array, and use a tail call to recurse into the other.

Use insertion sort, which has a smaller constant factor and is thus faster on small arrays, for invocations on such small arrays (i.e. where the length is less than a threshold t determined experimentally). This can be implemented by leaving such arrays unsorted and running a single insertion sort pass at the end, because insertion sort handles nearly sorted arrays efficiently. A separate insertion sort of each small segment as they are identified adds the overhead of starting and stopping many small sorts, but avoids wasting effort comparing keys across the many segment boundaries, which keys will be in order due to the workings of the quicksort process. It also improves the cache use.

I am currently recursing for both partitions. Any idea how to implement the first tip? What is meant by recurse first into the smaller half of the array, and use a tail call to recurse into the other? And secondly, how can I implement an insertion-sort within quicksort? Will it always improve the efficiency, or only when certain sections of the array are pre-sorted? If it is the 2nd case, then of course I have no way of knowing when will that occur. So when should I include the insertion-sort?

解决方案

In Quick-sort , you choose a random pivot that delimits the array to two half's, most of the chances that one may be smaller,

e.g. Array size 100, pivot delimits the array to 40 / 60, the 40 is the the smaller size.

Lets assume that you decide on your threshold size to use the insertion sort to be 10, you need to continue recursively split the array by pivot , whenever one of the half's become smaller or equal to 10, you may use the insertion sort that behaves like O(n) on small size arrays.

Take into account that insertion sort will behave badly if your array is sorted reversely (worst case).

As regards to the recursion stuff, you only need to modify the stop case of the quick-sort recursion -> array size <= 10 stop recursion and sort the all the array (which is much smaller in this recursion step) using insertion sort.

By tail recursion , they mean do everything you need with the first half, and then invoke insertion sort for the smaller half as a last method , it is used to save space.

  Quick-sort()
      choose a pivot
      move the smaller elements from left
      move the bigger elements from right
      quick-sort on the bigger half of the array

      if half is less then X
         only then do an insertion sort on the other half <- this is a tail recursion insertion sort 
      else
         quick sort on this half also

As far as i see , the second optimization suggest not to use insertion sort for every recursion step, but remember the indexes for which the constraint is made, then to invoke insertion sort in one batch concatenating the items from all the slices, this will insure improve the cache use , but is is slightly more difficult to implement,

这篇关于如何优化快速排序的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆