Benchmarking quicksort and mergesort yields that mergesort is faster

Problem Description

I've tried benchmarking them, and for some reason, on an array of 1M elements, Mergesort sorted it in 0.3s while Quicksort took 1.3s.

I've heard that quicksort is generally faster because of its memory management, but how would one explain these results?

I am running this on a MacBook Pro, if that makes any difference. The input is an array of randomly generated integers from 0 to 127.

The code is in Java:

MergeSort:

static void mergesort(int arr[]) {
    int n = arr.length;
    if (n < 2)
        return;
    int mid = n / 2;
    int left[] = new int[mid];
    int right[] = new int[n - mid];
    for (int i = 0; i < mid; i++)
        left[i] = arr[i];
    for (int i = mid; i < n; i++)
        right[i - mid] = arr[i];
    mergesort(left);
    mergesort(right);
    merge(arr, left, right);
}

public static void merge(int arr[], int left[], int right[]) {
    int nL = left.length;
    int nR = right.length;
    int i = 0, j = 0, k = 0;
    while (i < nL && j < nR) {
        if (left[i] <= right[j]) {
            arr[k] = left[i];
            i++;
        } else {
            arr[k] = right[j];
            j++;
        }
        k++;
    }
    while (i < nL) {
        arr[k] = left[i];
        i++;
        k++;
    }
    while (j < nR) {
        arr[k] = right[j];
        j++;
        k++;
    }
}

Quicksort:

public static void quickSort(int[] arr, int start, int end) {
    int partition = partition(arr, start, end);

    if (partition - 1 > start) {
        quickSort(arr, start, partition - 1);
    }
    if (partition + 1 < end) {
        quickSort(arr, partition + 1, end);
    }
}

public static int partition(int[] arr, int start, int end) {
    int pivot = arr[end];

    for (int i = start; i < end; i++) {
        if (arr[i] < pivot) {
            int temp = arr[start];
            arr[start] = arr[i];
            arr[i] = temp;
            start++;
        }
    }

    int temp = arr[start];
    arr[start] = pivot;
    arr[end] = temp;

    return start;
}
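
For reference, a timing harness along the following lines reproduces the setup described above. This is a hypothetical sketch, not the OP's actual code: the class name is made up, it assumes the mergesort and quickSort methods above are pasted into the same class, and a single wall-clock measurement like this ignores JIT warm-up, which can distort results on the JVM.

import java.util.Random;

public class SortBench {
    public static void main(String[] args) {
        Random rnd = new Random(42);            // fixed seed for repeatability
        int[] data = new int[1_000_000];
        for (int i = 0; i < data.length; i++)
            data[i] = rnd.nextInt(128);         // values in 0..127, as in the question

        int[] a = data.clone();
        long t0 = System.nanoTime();
        mergesort(a);
        System.out.printf("mergesort: %.3f s%n", (System.nanoTime() - t0) / 1e9);

        int[] b = data.clone();
        t0 = System.nanoTime();
        quickSort(b, 0, b.length - 1);
        System.out.printf("quickSort: %.3f s%n", (System.nanoTime() - t0) / 1e9);
    }
}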

Solution

Your implementations are a bit simplistic:

  • mergesort allocates 2 new arrays at each recursive call, which is expensive, yet some JVMs are surprisingly efficient at optimising such coding patterns (an allocation-free variant is sketched after this list).
  • quickSort uses a poor choice of pivot, the last element of the subarray, which gives quadratic time for sorted subarrays, including those with identical elements.
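
To illustrate the first point, here is one way to remove the per-call allocations: allocate a single scratch buffer up front and reuse it at every level of the recursion. This is a sketch, not the answerer's code, and the method names are made up.

static void mergesortBuffered(int[] arr) {
    int[] tmp = new int[arr.length];    // one allocation for the whole sort
    mergesortBuffered(arr, tmp, 0, arr.length);
}

static void mergesortBuffered(int[] arr, int[] tmp, int lo, int hi) {
    if (hi - lo < 2)
        return;
    int mid = (lo + hi) >>> 1;
    mergesortBuffered(arr, tmp, lo, mid);
    mergesortBuffered(arr, tmp, mid, hi);
    /* merge arr[lo..mid) and arr[mid..hi) into tmp, then copy back */
    int i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        tmp[k++] = arr[i] <= arr[j] ? arr[i++] : arr[j++];
    while (i < mid)
        tmp[k++] = arr[i++];
    while (j < hi)
        tmp[k++] = arr[j++];
    System.arraycopy(tmp, lo, arr, lo, hi - lo);
}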

The data set, an array of pseudo-random numbers in the small range 0..127, hits the quickSort implementation's weakness much harder than the mergesort version's inefficiency. Increasing the dataset size should make this even more obvious, and might even cause a stack overflow from too many recursive calls. Data sets with common patterns, such as runs of identical values, increasing or decreasing sequences, and combinations of these, would cause catastrophic performance in this quickSort implementation.
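
To see the failure mode concretely, consider a constant array: with the last element as pivot, no element ever satisfies arr[i] < pivot, so each partition peels off a single element and the recursion depth grows linearly with the array size. A hypothetical demonstration (the class name is made up, it assumes the question's quickSort is in scope, and whether it actually overflows depends on the JVM's stack size):

public class QuadraticDemo {
    public static void main(String[] args) {
        int[] same = new int[100_000];              // all zeros: worst case here
        try {
            quickSort(same, 0, same.length - 1);    // the question's version
            System.out.println("finished, but in quadratic time");
        } catch (StackOverflowError e) {
            System.out.println("O(n) recursion depth blew the stack");
        }
    }
}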

Here is a slightly modified version with a less pathological choice of pivot (the element at 3/4 of the array) and a loop that detects duplicates of the pivot value, which improves efficiency on datasets with repeated values. It performs much better (100x) on my standard sorting benchmark with arrays of just 40k elements, but is still much slower (8x) than radixsort:

public static void quickSort(int[] arr, int start, int end) {
    int p1 = partition(arr, start, end);
    int p2 = p1;

    /* skip elements identical to the pivot */
    while (++p2 <= end && arr[p2] == arr[p1])
        continue;

    if (p1 - 1 > start) {
        quickSort(arr, start, p1 - 1);
    }
    if (p2 < end) {
        quickSort(arr, p2, end);
    }
}

public static int partition(int[] arr, int start, int end) {
    /* choose pivot at 3/4 of the array */
    int i = end - ((end - start + 1) >> 2);
    int pivot = arr[i];
    arr[i] = arr[end];
    arr[end] = pivot;

    for (i = start; i < end; i++) {
        if (arr[i] < pivot) {
            int temp = arr[start];
            arr[start] = arr[i];
            arr[i] = temp;
            start++;
        }
    }

    int temp = arr[start];
    arr[start] = pivot;
    arr[end] = temp;

    return start;
}


For the OP's dataset, assuming decent randomness of the distribution, scanning for duplicates is what accounts for the performance improvement. Choosing a different pivot, whether first, last, middle, 3/4, 2/3, or even a median of 3, has almost no impact, as expected.

Further testing on other, non-random distributions shows catastrophic performance for this quickSort implementation due to the choice of pivot. On my benchmark, much better performance is obtained by choosing the pivot at 3/4 or 2/3 of the array (a 300x improvement for 50k samples, 40% faster than standard merge sort, and comparable in time to radix_sort).

  • Mergesort has the distinct advantage of being stable and predictable for all distributions, but it requires extra memory, between 50% and 100% of the dataset size.
  • A carefully implemented Quicksort is somewhat faster in many cases and sorts in place, requiring only log(N) stack space for recursion. Yet it is not stable, and tailor-made distributions will exhibit catastrophic performance, possibly crashing.
  • Radixsort is only appropriate for specific kinds of data, such as integers and fixed-length strings. It also requires extra memory.
  • Countingsort would be the most efficient for the OP's dataset, as it only needs an array of 128 integers to count the occurrences of the different values, known to be in the range 0..127. It will run in linear time for any distribution.
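
To illustrate the last point, a minimal counting sort for this value range might look as follows; a sketch assuming the values are known to lie in 0..127 (the method name is made up):

static void countingSort(int[] arr) {
    int[] counts = new int[128];        // one slot per possible value
    for (int v : arr)
        counts[v]++;                    // tally occurrences: O(n)
    int k = 0;
    for (int v = 0; v < 128; v++)       // rewrite the array in order: O(n + 128)
        for (int c = counts[v]; c > 0; c--)
            arr[k++] = v;
}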
