Median of three values strategy


Problem description

What is the median-of-three strategy for selecting the pivot value in quicksort?

I have been reading about it on the web, but I can't figure out what exactly it is, or how it is better than randomized quicksort.

Recommended answer

The median of three has you look at the first, middle, and last elements of the array and choose the median of those three elements as the pivot.
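As a minimal sketch of that selection step (in Python for brevity; the function name `median_of_three_index` and the inclusive `lo`/`hi` bounds are assumptions made for illustration, not part of the original answer):

```python
def median_of_three_index(a, lo, hi):
    """Return the index of the median of a[lo], a[mid], and a[hi].

    lo and hi are inclusive bounds of the range being partitioned.
    """
    mid = lo + (hi - lo) // 2
    # Order the three candidate indices by the values they point at
    # and take the one in the middle.
    candidates = sorted((lo, mid, hi), key=lambda i: a[i])
    return candidates[1]


data = [9, 1, 5, 3, 2]
pivot_index = median_of_three_index(data, 0, len(data) - 1)
print(pivot_index, data[pivot_index])  # prints "2 5"
```

Here the three samples are 9, 5, and 2, so the pivot is 5, which avoids both the largest and the smallest of the sampled values.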

To get the "full effect" of the median of three, it is also important to sort those three items, not just use the median as the pivot. This doesn't affect what is chosen as the pivot in the current iteration, but it can affect what is used as the pivot in the next recursive call, which helps to limit the bad behavior for a few initial orderings. One ordering that turns out to be particularly bad in many cases is an array that is sorted except for having the smallest element at the high end of the array (or the largest element at the low end).
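One way this could look in practice is sketched below. It is an illustration rather than the answerer's own code: the helper names `sort_three` and `quicksort` are hypothetical, and parking the pivot at `hi - 1` before partitioning is a common textbook variation that is assumed here, not something the answer specifies.

```python
def sort_three(a, lo, mid, hi):
    """Rearrange a[lo], a[mid], a[hi] in place so that a[lo] <= a[mid] <= a[hi]."""
    if a[mid] < a[lo]:
        a[lo], a[mid] = a[mid], a[lo]
    if a[hi] < a[lo]:
        a[lo], a[hi] = a[hi], a[lo]
    if a[hi] < a[mid]:
        a[mid], a[hi] = a[hi], a[mid]


def quicksort(a, lo=0, hi=None):
    """Sort a[lo..hi] in place using a median-of-three pivot."""
    if hi is None:
        hi = len(a) - 1
    while hi - lo >= 2:  # at least three elements in the range
        mid = lo + (hi - lo) // 2
        sort_three(a, lo, mid, hi)  # the three samples are now in order
        # Park the pivot next to the end; a[lo] and a[hi] act as sentinels.
        a[mid], a[hi - 1] = a[hi - 1], a[mid]
        pivot = a[hi - 1]
        i, j = lo, hi - 1
        while True:
            i += 1
            while a[i] < pivot:
                i += 1
            j -= 1
            while a[j] > pivot:
                j -= 1
            if i >= j:
                break
            a[i], a[j] = a[j], a[i]
        a[i], a[hi - 1] = a[hi - 1], a[i]  # move the pivot to its final slot
        # Recurse into the smaller side and loop on the larger one,
        # which keeps the recursion depth logarithmic.
        if i - lo < hi - i:
            quicksort(a, lo, i - 1)
            lo = i + 1
        else:
            quicksort(a, i + 1, hi)
            hi = i - 1
    if hi - lo == 1 and a[hi] < a[lo]:  # two elements left: one compare-and-swap
        a[lo], a[hi] = a[hi], a[lo]


# Sorted except for the smallest element sitting at the high end.
nearly_sorted = list(range(1, 10)) + [0]
quicksort(nearly_sorted)
print(nearly_sorted)  # prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Because `sort_three` moves the misplaced smallest element to the low end on the first call, the sub-ranges passed to later calls start from better-placed samples.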

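Compared to picking the pivot randomly:

  1. It ensures that one common case (fully sorted data) remains optimal.
  2. It is more difficult to manipulate into giving the worst case.
  3. A PRNG is often relatively slow.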
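The first point can be illustrated with a rough experiment. The sketch below measures how deep a simplified quicksort recurses on already sorted input. It is a model built on assumptions, not a production sort: it partitions by building new lists, it assumes distinct values, and the names `max_depth`, `first_element`, and `median_of_three` are hypothetical. A first-element pivot is used as the contrast case because it is the classic poor choice for sorted data.

```python
def first_element(part):
    """Naive pivot choice: always the first element."""
    return part[0]


def median_of_three(part):
    """Median of the first, middle, and last elements."""
    return sorted([part[0], part[len(part) // 2], part[-1]])[1]


def max_depth(values, choose_pivot):
    """Depth of the partition tree of a simplified quicksort.

    Assumes distinct values; uses an explicit stack instead of recursion.
    """
    stack = [(list(values), 1)]
    deepest = 0
    while stack:
        part, depth = stack.pop()
        if len(part) <= 1:
            continue  # nothing left to partition
        deepest = max(deepest, depth)
        pivot = choose_pivot(part)
        stack.append(([v for v in part if v < pivot], depth + 1))
        stack.append(([v for v in part if v > pivot], depth + 1))
    return deepest


sorted_input = range(1023)
print(max_depth(sorted_input, first_element))    # prints 1022
print(max_depth(sorted_input, median_of_three))  # prints 9
```

On sorted input the middle sample is the true median of the range, so every split is perfectly balanced and the depth stays logarithmic.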

That second point probably bears a bit more explanation. If you use the obvious random number generator (rand()), it is fairly easy (in many cases, anyway) for somebody to arrange the elements so that it continually chooses poor pivots. This can be a serious concern for something like a web server that may be sorting data entered by a potential attacker, who could mount a DoS attack by getting your server to waste a lot of time sorting the data. In a case like this, you could use a truly random seed, or you could include your own PRNG instead of using rand(), or you could use median of three, which also has the other advantages mentioned.
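To make the predictability concern concrete, here is a small sketch using Python's standard `random` module as a stand-in for C's rand(); the two are different generators, so this is only an analogy. The helper name `random_pivot_indices` and the partition sizes are made up for illustration.

```python
import random


def random_pivot_indices(rng, sizes):
    """Pivot index a randomized quicksort would pick for each partition size."""
    return [rng.randrange(n) for n in sizes]


sizes = [1000, 500, 250, 125]

# A generator with a known or guessable seed always repeats the same choices,
# so anyone who knows the seed can work out which pivots will be picked.
first_run = random_pivot_indices(random.Random(42), sizes)
second_run = random_pivot_indices(random.Random(42), sizes)
print(first_run == second_run)  # prints True

# random.SystemRandom draws from the operating system's entropy source and
# cannot be seeded, so its choices are not reproducible in this way.
unpredictable = random_pivot_indices(random.SystemRandom(), sizes)
print(all(0 <= i < n for i, n in zip(unpredictable, sizes)))  # prints True
```

The second generator is typically slower per call, which is the overhead trade-off that comes with stronger randomness.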

On the other hand, if you use a sufficiently random generator (e.g., a hardware generator or encryption in counter mode), it is probably more difficult to force a bad case than it is for a median-of-three selection. At the same time, achieving that level of randomness typically has quite a bit of overhead of its own, so unless you really expect to be attacked in this case, it is probably not worthwhile. If you do, it is probably worth at least considering an alternative that guarantees an O(N log N) worst case, such as a merge sort or heap sort.
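For reference, a heap sort can be sketched in a few lines with Python's standard `heapq` module. The function name `heap_sort` is hypothetical, and unlike a classic in-place heap sort this version returns a new list. Building the heap is O(N) and each of the N pops is O(log N), so no input ordering can push it past O(N log N).

```python
import heapq


def heap_sort(values):
    """Return a new sorted list; O(N log N) even in the worst case."""
    heap = list(values)
    heapq.heapify(heap)  # O(N) heap construction
    # Each pop removes the current minimum in O(log N).
    return [heapq.heappop(heap) for _ in range(len(heap))]


print(heap_sort([5, 3, 8, 1, 9, 2]))  # prints [1, 2, 3, 5, 8, 9]
```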
