C: Merge-Sort of an array with uneven number of elements


Problem description

I've been working on an assignment for my Procedural Programming class where we are provided with a merge-sort program that does not function completely. It performs a merge-sort on arrays with an even number of integers, but throws a segmentation fault with an odd number of integers.

I understand how the sorting works, and that the segmentation fault occurs because an odd element count somehow causes the array to be overfilled. I also understand that the solution is going to involve a test for whether the length of the original array is even or odd, and then passing the values to the merge function differently depending on the result. Despite what I do understand about the program, I have been banging my head against the wall for weeks trying to get this to work properly, and I'm hoping someone can give me some advice.

I've done a lot of looking around for answers before posting this, but all other examples involve merge-sort programs with structs, which is beyond what I've learned so far. You'll see in the code I post below. Also, the full program involves a few other files, but I've included just the mergesort.c file and the merge.c file which, as I've been assured by my professor, are the only places any changes need to be made. The main file works perfectly and is only responsible for filling the array and calling the mergesort function. If the other files are necessary, let me know and I'll post them. The only reason I haven't is because we are using a Linux shell, and I haven't found a practical way to copy and paste code from the shell to my own operating system, and it takes a while to write it out.

Thanks in advance for any pointers you can provide. Here is the code.

mergesort.c

#include <"mergesort.h">

void mergesort(int key[], int n) //key is the array, n is the size of key
{
    int j, k, m, *w;

    w = calloc(n, sizeof(int));
    assert(w != NULL);

    for (k = 1; k < n; k *= 2) {
        for (j = 0; j < n - k; j += 2 * k) {
            merge(key + j, key + j + k, w + j, k, k);
        }
        for (j = 0; j < n; ++j) {
            key[j] = w[j];
        }   
    }
    free(w);
}

merge.c

#include "mergesort.h"

void merge(int a[], int b[], int c[], int m, int n) {
    int i = 0, j = 0, k = 0;

    while (i < m && j < n) {
        if (a[i] < b[j]) {
            c[k++] = a[i++];
        } else {
            c[k++] = b[j++];
        }   
    }

    while (i < m) {
        c[k++] = a[i++];
    }
    while (j < n) {
        c[k++] = b[j++];
    }   
}

Solution

Your code has some problems:

  • The include preprocessor directive is incorrect: use either #include "mergesort.h" or #include <mergesort.h>.

  • You must compute the size of the arrays passed to merge() correctly so it does not read beyond the end of the last chunk. As currently coded, n must be a power of 2 to avoid undefined behavior.
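To make the second point concrete, here is a minimal sketch that replays only the index arithmetic of the original loops, without touching any array. The function name original_overruns is made up for this illustration; it reports whether any merge(key + j, key + j + k, w + j, k, k) call would reach past element n.

```c
#include <stdbool.h>

/* Replays the loop bounds of the original mergesort() and reports whether
 * any merge(key + j, key + j + k, w + j, k, k) call would touch an index
 * at or beyond n. No array is accessed; only the arithmetic is checked.
 * original_overruns is a hypothetical helper, not part of the assignment. */
bool original_overruns(int n) {
    for (int k = 1; k < n; k *= 2) {
        for (int j = 0; j < n - k; j += 2 * k) {
            /* this call reads key[j .. j + 2k) and writes w[j .. j + 2k) */
            if (j + 2 * k > n)
                return true;
        }
    }
    return false;
}
```

With this check, sizes 2, 4 and 8 pass while 3, 5 and 6 do not, so the problem is not odd versus even: an even size such as 6 overruns as well. Separately, the original copy loop writes all of w back to key after each pass, so with n = 5 the unmerged last element is replaced by the zero that calloc left in w[4].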

Here is a corrected version of mergesort.c for your purpose:

#include "mergesort.h"

void mergesort(int key[], int n) {
    // key is the array, n is the number of elements
    int i, j, k, m;
    int *w;

    // allocate the working array
    w = calloc(n, sizeof(int));
    // abort the program on allocation failure
    assert(w != NULL);

    // for pairs of chunks of increasing sizes
    for (k = 1; k < n; k *= 2) {
        // as long as there are enough elements for a pair
        for (j = 0; j + k < n; j = j + k + m) {
            // compute the size of the second chunk: default to k
            m = k;
            if (j + k + m > n) {
                // chunk is the last one, size may be smaller than k
                m = n - j - k;
            }
            // merge adjacent chunks into the working array
            merge(key + j, key + j + k, w + j, k, m);
            // copy the resulting sorted list back to the key array
            for (i = 0; i < k + m; i++) {
                key[j + i] = w[j + i];
            }
        }
    }
    free(w);
}
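To see what the corrected loops do for an odd size, the following sketch records the (j, k, m) values that would be passed to merge() instead of performing the merges. The struct merge_step and the function plan_merges are hypothetical names introduced only for this trace.

```c
#include <stddef.h>

/* One planned merge: left chunk start, left chunk size, right chunk size. */
struct merge_step { int j, k, m; };

/* Records the (j, k, m) values the corrected loops would use for an array
 * of n elements. Stores at most cap steps in out and returns the total
 * number of merges. plan_merges is a hypothetical helper for illustration. */
size_t plan_merges(int n, struct merge_step out[], size_t cap) {
    size_t count = 0;
    for (int k = 1; k < n; k *= 2) {
        int m;
        for (int j = 0; j + k < n; j += k + m) {
            m = k;                      /* default: a full right chunk */
            if (j + k + m > n)
                m = n - j - k;          /* last right chunk may be shorter */
            if (count < cap)
                out[count] = (struct merge_step){ j, k, m };
            count++;
        }
    }
    return count;
}
```

For n = 5 this yields four merges: (0, 1, 1), (2, 1, 1), (0, 2, 2) and (0, 4, 1). The fifth element has no partner in the first two passes, so it stays untouched in key until the final pass merges it as a right chunk of size 1.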

Here are some additional remarks about this exercise, although they may go beyond what your course has covered so far, and changing the API is probably not allowed:

  • Using 2 different source files seems overkill. The merge routine is an auxiliary function that deserves to be static. Modern compilers will typically expand it inline.

  • Array sizes should be passed as size_t just after the corresponding pointer (for consistency).

  • Instead of asserting the allocation success, you should return a failure code and let the caller handle the failure gracefully.

  • You can use the start of the working array for all merge operations. This improves cache efficiency.

Here is a version with all these changes:

#include "mergesort.h"

static void merge(int a[], size_t m, int b[], size_t n, int c[]) {
    size_t i = 0, j = 0, k = 0;

    while (i < m && j < n) {
        if (a[i] < b[j]) {
            c[k++] = a[i++];
        } else {
            c[k++] = b[j++];
        }
    }
    while (i < m) {
        c[k++] = a[i++];
    }
    while (j < n) {
        c[k++] = b[j++];
    }
}

int mergesort(int key[], size_t n) { 
    // key is the array, n is the size of key
    // return 0 for success, -1 for failure with error code in errno
    size_t i, j, k, m;
    int *w;

    if (n < 2)
        return 0;   // nothing to sort; calloc(0, ...) may legitimately return NULL

    w = calloc(n, sizeof(int));
    if (w == NULL)
        return -1;

    for (k = 1; k < n; k *= 2) {
        for (j = 0; j + k < n; j += k + m) {
            m = k;
            if (j + k + m > n) {
                m = n - j - k;
            }
            // merge adjacent chunks into the start of the working array
            merge(key + j, k, key + j + k, m, w);
            // copy the sorted chunk back to the key array
            for (i = 0; i < k + m; i++) {
                key[j + i] = w[i];
            }
        }
    }
    free(w);
    return 0;
}
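A bug that only shows up for certain sizes is easiest to catch by sorting every length in a range. The sketch below is one possible test driver, not part of the assignment: it repeats the algorithm above in compact form, merging into the start of the working array, under the hypothetical name msort_ints (some BSD-derived <stdlib.h> headers already declare their own mergesort()), and the checker check_all_sizes is likewise an invented name. It also shows a caller reacting to the -1 return value.

```c
#include <stdlib.h>

static void merge(const int a[], size_t m, const int b[], size_t n, int c[]) {
    size_t i = 0, j = 0, k = 0;
    while (i < m && j < n)
        c[k++] = (a[i] < b[j]) ? a[i++] : b[j++];
    while (i < m) c[k++] = a[i++];
    while (j < n) c[k++] = b[j++];
}

/* Bottom-up merge sort; returns 0 on success, -1 on allocation failure.
 * msort_ints is a hypothetical name chosen to avoid clashing with libc. */
int msort_ints(int key[], size_t n) {
    size_t i, j, k, m;
    int *w;
    if (n < 2) return 0;
    w = calloc(n, sizeof(int));
    if (w == NULL) return -1;
    for (k = 1; k < n; k *= 2) {
        for (j = 0; j + k < n; j += k + m) {
            m = (j + k + k > n) ? n - j - k : k;
            merge(key + j, k, key + j + k, m, w);
            for (i = 0; i < k + m; i++) key[j + i] = w[i];
        }
    }
    free(w);
    return 0;
}

/* Sorts a pseudo-random array of every length 0..max_n.
 * Returns 0 if all pass, 1 if a result is unsorted or its element sum
 * changed, -1 if memory could not be allocated. */
int check_all_sizes(size_t max_n) {
    unsigned x = 1;                       /* simple deterministic generator */
    int *key = malloc((max_n ? max_n : 1) * sizeof(int));
    if (key == NULL) return -1;
    for (size_t n = 0; n <= max_n; n++) {
        long before = 0, after = 0;
        for (size_t i = 0; i < n; i++) {
            x = x * 1103515245u + 12345u;
            key[i] = (int)((x >> 16) % 1000);
            before += key[i];
        }
        if (msort_ints(key, n) != 0) { free(key); return -1; }
        for (size_t i = 0; i < n; i++) {
            after += key[i];
            if (i > 0 && key[i - 1] > key[i]) { free(key); return 1; }
        }
        if (before != after) { free(key); return 1; }
    }
    free(key);
    return 0;
}
```

Calling check_all_sizes(65) covers odd sizes, even sizes that are not powers of 2, and sizes just past a power of 2, which are the cases where the original code went out of bounds.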

You can further improve the implementation by removing almost half the tests on the index variables in function merge():

static void merge(int a[], size_t m, int b[], size_t n, int c[]) {
    /* always called with m > 0 and n > 0 */
    for (size_t i = 0, j = 0, k = 0;;) {
        if (a[i] < b[j]) {
            c[k++] = a[i++];
            if (i == m) {
                while (j < n) {
                    c[k++] = b[j++];
                }
                break;
            }
        } else {
            c[k++] = b[j++];
            if (j == n) {
                while (i < m) {
                    c[k++] = a[i++];
                }
                break;
            }
        }
    }
}

You can improve mergesort and merge with these further ideas:

  • comparing the last element of a and the first element of b in merge allows vast speed improvements on partially or totally sorted arrays.

  • merge could return the number of elements to copy back, removing all copying in the sorted case.

  • by copying the left chunk to the temporary array and merging into the key array, you can reduce the size of the temporary array.

  • merging balanced chunk sizes instead of powers of 2 reduces the total number of comparisons for array sizes that are not powers of 2, but it is easier to implement with a recursive approach.
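As a rough sketch of how three of these ideas could fit together (the early exit when the chunks are already in order, copying only the left chunk to the temporary array, and balanced recursive splitting), consider the code below. All names (merge_adjacent, sort_rec, sort_ints_balanced) are invented for this example, and it is only one possible arrangement. The idea of having merge return the number of elements to copy back is not shown.

```c
#include <stdlib.h>
#include <string.h>

/* Merges the adjacent sorted runs key[0..m) and key[m..m+n) in place.
 * Requires m > 0 and n > 0; tmp must have room for at least m ints. */
static void merge_adjacent(int key[], size_t m, size_t n, int tmp[]) {
    size_t i = 0, j = m, k = 0, end = m + n;

    /* last element of the left run vs first of the right run:
     * if they are already in order, there is nothing to move */
    if (key[m - 1] <= key[m])
        return;

    /* copy only the left run, then merge back into key; the write
     * position k never passes the unread right position j */
    memcpy(tmp, key, m * sizeof(int));
    while (i < m && j < end) {
        if (key[j] < tmp[i])
            key[k++] = key[j++];
        else
            key[k++] = tmp[i++];
    }
    while (i < m)
        key[k++] = tmp[i++];
    /* any remaining right-run elements are already in their final place */
}

/* Recursive sort with a balanced split; the left half has n / 2 elements. */
static void sort_rec(int key[], size_t n, int tmp[]) {
    if (n < 2)
        return;
    size_t m = n / 2;
    sort_rec(key, m, tmp);
    sort_rec(key + m, n - m, tmp);
    merge_adjacent(key, m, n - m, tmp);
}

/* Returns 0 on success, -1 if the temporary array cannot be allocated. */
int sort_ints_balanced(int key[], size_t n) {
    int *tmp;
    if (n < 2)
        return 0;
    tmp = malloc((n / 2) * sizeof(int));  /* largest left run is n / 2 */
    if (tmp == NULL)
        return -1;
    sort_rec(key, n, tmp);
    free(tmp);
    return 0;
}
```

Because the left run is never larger than half of the range being sorted, the temporary array needs only n / 2 elements here. In the bottom-up version the left chunk can be almost as large as the whole array (for example 4 out of 5 elements), so the saving from copying only the left chunk would be much smaller there.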
