如果更多重复键,快速排序算法改进 [英] quick sort algorithm improvement if more duplicate keys

查看:126
本文介绍了如果更多重复键,快速排序算法改进的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在阅读Robert Sedwick的算法和数据结构中的快速排序算法第1-4部分。

  template< class item> 

static void quicksort(item [] a,int l,int r)
{
if(r <= l)return;
int i = partition(a,l,r);
quickSort(a,l,i-1);
quicksort(a,i + 1,r);
}

模板< class item>
int partition(item a [],int l,int r)
{
int i = 1-1; j = r; item v = a [r];

for(;;){
while(a [++ i]< v);
while(v if(j == l)
break;
if(i> = j)
break; //指针交叉。
exch(a [i],a [j]);
}

exch(a [i],a [r]);
return i;
}

Book有以上算法的下列文本。


当文件中存在重复的键时,指针交叉是
。我们可以稍微改进分区过程
,当i < j,然后使用j,而不是i-1,
来为第一个递归
调用定界左子文件的右端。在这种情况下让循环重复一次是一个
改进,因为当扫描循环终止与j
和我指的是同一个元素,我们结束了两个元素在
它们的最终位置:停止两次扫描的元素,因此必须
等于分区元素,而分区
元素本身。这种改变可能是值得的,因为在这个
特定情况下,程序留下一个记录,其中的键等于
在[r]中的分区键,并使得第一个分区在
call quick-sort(a,i + 1,r)退化,因为它的最右键是
它最小。



$ b b

我的问题是


  1. 我们如何修改上面的程序,

  2. 如果存在更多重复的键,上述快速排序算法无法有效工作。

  3. 如何

  4. 作者的意思是他的第一个分区在调用quick-sort(a,i + 1,r)退化,因为它的权利最关键的是它最小。 ?
    的作者是什么意思是在这里退化?

感谢您的时间和帮助。

>>如果存在更多重复的键,为什么上述快速排序算法无法有效工作?

< br>
因为你的破坏条件是: if(i> = j)break;

因此,从 i j 的两边,这是很可能你打破 i == j ,而不是让 i 超过 j


> $ b> 时,我们在 i == j
$ b

当你从第一个while循环中取出 i == j; 时,你必须有 a [i] ; = v 和从第二个while循环 a [j] <= v ,但是因为我们正在考虑一个'break'for: i == j so, a [i] = a [j] = v ie a [i ] 枢纽元素相同 v



<在这种情况下,你最外面的 exch(a [i],a [r]); 将简单地交换枢轴值到自身。

因此,在下一个递归调用 quicksort(a,i + 1,r); 右侧数组中,元素坐在最右端(你的枢轴选择策略简单, item v = a [r]; ),我们都知道QuickSort选择一个枢轴是不好的元素,其等于阵列的最小值或最大值。因此,您之后对于右半边的递归调用将是一个退化的。

这就是为什么作者建议不要破解 i == j 但在这之前抓住他们。



>>作者的意思是什么在这里退化?



递归树变得歪斜,即,后续问题不是由几乎相等的大小产生的。
您正在将大小 N 的问题分解为大小 N-1 1 N / 2的问题,而不是更平衡的东西, 2



>>我们如何使用下面的描述修改上述程序?



我们可以实现它像下面这样:

  int partition(int A [],int l,int r){
int i = l-1,j = r,v = A [r];
for(;;){
while(A [++ i]< v);
while(A [ - j]> v)
if(j == l)
break;
if(i> = j)
break;
swap(A [i],A [j]);
}
if(i == j){//当我们在pivot元素处停止时的情况。
j = j + 1; // backtrack j 1 step。
if(j <= r)
swap(A [j],A [r]);
return j; //现在分区j上的后续问题。
}
swap(A [i],A [r]);
return i;
}

>>如果有更多的复制键,

通过让您不产生明显的退化案例情况,提高了性能。


I am reading quick sort algorithm in book by Robert Sedwick Algorithms and data structures part 1-4.

template <class item>

static void quicksort(item [] a, int l, int r)
{
    if(r <= l) return;
    int i = partition(a,l,r);
    quicksort(a, l, i-1);
    quicksort(a, i+1, r);
}

template <class item>
int partition(item a[], int l, int r)
{
    int i = l-1; j = r; item v = a[r];

    for(;;) {
        while( a[++i] < v );
        while( v < a[--j] ) 
            if( j == l ) 
                break;
        if( i >= j) 
            break;  // pointer crossing.
        exch(a[i], a[j]);
    }

    exch(a[i], a[r]);
    return i;
}

Book has following text on above algorithm.

When duplicate keys are present in the file, the pointer crossing is subtle. we could improve the partitioning process slightly by terminating the scans when i < j, and then using j, rather than i-1, to delimit the right end of the left subfile for the first recursive call. Letting the loop iterate one more time in this case is an improvement, because, when ever the scanning loops terminate with j and i referring to the same element, we end up with two elements in their final positions: the element that stopped both scans, which must therefore be equal to the partitioning element, and the partitioning element itself. This change is probably worth making, because, in this particular case, the program leaves a record with a key equal to the partitioning key in a[r], and that makes the first partition in the call quick-sort(a, i+1, r) degenerate, because its right most key is its smallest.

My questions are

  1. How can we modify above program with description below? I am having difficulty in modifying it to understand text.
  2. Why above quick sort algorithm does not work effectively if more duplicate keys are present.
  3. How above modification improve if more duplication keys are present?
  4. What does author mean by "he first partition in the call quick-sort(a, i+1, r) degenerate, because its right most key is its smallest." ? What does author mean by degenerate here?

Thanks for your time and help.

解决方案

>>Why above quick sort algorithm does not work effectively if more duplicate keys are present?

It becomes inefficient because your breaking condition is: if(i >= j) break;
so, as you scan from both sides with i and j, It is quite possible that you break when i == j instead of letting i surpass over j.

What bad could possibly happen when we break on i==j when many duplicate keys are present ?

When you break for i==j; from first while loop you must have had a[i] >= v and from second while loop a[j] <=v but since we are considering a 'break' for: i==j so, a[i] = a[j] = v i.e. a[i] is same as v, your pivot element.

In such a scenario, your outermost exch(a[i], a[r]); will simply exchange pivot value to itself.
Hence, in your next recursive call quicksort(a, i+1, r); for Right-half of the array, you would have minimum element sitting at the rightmost end.( your pivot choosing strategy is simply, item v = a[r]; ) and we all know it is bad for QuickSort to choose a pivot element which amounts to the minimum or the maximum of the array. Hence, your subsequent recursive call for right-half will be a degenerate one.
That is why author is advising not to break for i==j but catch them just before that happens.

>>What does author mean by degenerate here?

Degenerate here means, the recursion tree is getting skewed i.e. the subsequent problems are not being generated of nearly equal sizes. You are dividing a problem of size N into something like problems of size N-1 and 1 instead of something more balanced, like dividing it into problems of size N/2 and N/2.

>>How can we modify above program with description below?

We could implement it like following:

int partition(int A[], int l, int r){
        int i=l-1, j=r, v = A[r];
        for(;;){
                while(A[++i] < v);
                while(A[--j] > v)
                        if(j == l)
                                break;
                if(i>=j)
                        break;
                swap(A[i], A[j]);
        }
        if(i == j){// case when we stopped at the pivot element.
                j = j+1;//backtrack j 1 step.
                if(j <= r)
                    swap(A[j], A[r]);
                return j;// partition the subsequent problems around j now.
        }
        swap(A[i], A[r]);
        return i;
}

>>How above modification improve if more duplication keys are present?
It improves the performance by letting you NOT generate an obvious scenario of a degenerate case.

这篇关于如果更多重复键,快速排序算法改进的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆