合并排序在研发 [英] Merge Sort in R

查看:145
本文介绍了合并排序在研发的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我是自我学习的书算法导论,由Cormen等阿利。 在他们的著作,他们使用伪code即假设数组由指针传递(参考)。这是从R(其中的对象是按值传递)不同,所以我有一些困难,努力转变自己的伪code尽可能接近,涉及递归时尤其如此。大多数时候,我一定要不同的方式实现了很多东西。

I am self studying the book "Introduction to Algorithms" by Cormen et alli. In their book, they use pseudo-code which assumes that arrays are passed by pointer (by reference). This is different from R (where objects are passed by value), so I am having some difficulties trying to translate their pseudo-code as close as possible, especially when recursion is involved. Most of the time, I have to implement things a lot differently.

例如,在合并排序算法,它们定义合并功能(我想我已经正确地翻译)和递归归并功能(其中直接翻译为R不工作)。

For example, with the Merge Sort algorithm, they define the Merge Function (which I think I have translated correctly) and the recursive MergeSort function (where direct translation to R does not work).

在伪code的合并函数是如下,其中:A是一个数组和p,q和r为索引到阵列,使得P&所述; Q<河该过程假定子数组A [号码:Q]和A [Q + 1:R]正在有序。它融合他们形成一个子数组排序代替现有的子数组A [号码:R]

The merge function in pseudo-code is as follows where: A is an array and p, q, and r are indices into the array such that p < q < r. The procedure assumes that the subarrays A[p:q] and A[q+1:r] are in sorted order. It merges them to form a single sorted subarray that replaces the current subarray A[p:r]

Merge(A, p, q, r)
n1 = q - p + 1
n2 = r - q
let L[1...n1+1] and R[1...n2+1] be new arrays
for i = 1 to n1
    L[i] = A[p+i-1]
for j = 1 to n2
    R[j] = A[q+j]
L[n1+1] = infinite
R[n2+1] = infinite
i=1
j=1
for k = p to r
    if L[i] <= R[j]
        A[j] = L[i]
        i = i + 1
    else
        A[k] = R[j]
        j = j + 1

我已经转换至R为:

Which I've translated to R as:

Merge <- function(a, p, q, r){
  n1 <- q - p + 1
  n2 <- r - q
  L <- numeric(n1+1)
  R <- numeric(n2+1)
  for(i in 1:n1){
    L[i] <- a[p+i-1]
  }
  for(j in 1:n2){
    R[j] <- a[q+j]
  }
  L[n1+1] <- Inf
  R[n2+1] <- Inf
  i=1
  j=1
  for(k in p:r){
    if(L[i] <= R[j]){
      a[k] <- L[i]
      i <- i +1
    }else{
      a[k] <- R[j]
      j <- j+1
    }
  }
  a
}

和它似乎很好地工作。

Merge(c(1,3,5, 2,4,6), 1, 3, 6)
[1] 1 2 3 4 5 6

现在归并功能在伪code定义如下:

Now the MergeSort function is defined in pseudo-code as follows:

MergeSort(A, p, r)
if p < r
   q = (p+r)/2
   MergeSort(A, p, q)
   MergeSort(A, q+1, r)
   Merge(A, p, q, r)

这假设A是由参考和每一个变化是可见的每一个递归调用,这是不正确的R.通过

This assumes that A is passed by reference and that every change is visible to every recursive call, which is not true in R.

因此​​,给出的合并功能上面的定义,你将如何贯彻落实归并在研发功能,以获得正确的结果? (如果可能,然后preferable,但不是必需的,有点类似伪code)

So, given the Merge function defined above, how you would implement the MergeSort function in R to obtain the correct results? (if possible, and preferable, but not necessary, somewhat similar to the pseudo-code)

推荐答案

试图做的是一种语言,它允许传递按引用的语言不支持用伪code直译这是一个可怕的想法。的r并不意味着工作在一个函数中的数组片。这只是不是一个适当的翻译。伪code应该是沟通的算法,你再翻译成相应的语言的精神。这里的归并的精神成R的一种可能的翻译。

Trying to do a literal translation of pseudocode that is written for a language that allows for pass-by-reference in a language that does not support it is a terrible idea. R's not meant to work on slices of an array within a function. That's just not an appropriate translation. The pseudocode is supposed to communicate the spirit of the algorithm which you then translate into the appropriate language. Here's one possible translation of the spirit of mergesort into R.

mmerge<-function(a,b) {
    r<-numeric(length(a)+length(b))
    ai<-1; bi<-1; j<-1;
    for(j in 1:length(r)) {
        if((ai<=length(a) && a[ai]<b[bi]) || bi>length(b)) {
            r[j] <- a[ai]
            ai <- ai+1
        } else {
            r[j] <- b[bi]
            bi <- bi+1          
        }
    }
    r
}
mmergesort<-function(A) {
    if(length(A)>1) {
        q <- ceiling(length(A)/2)
        a <- mmergesort(A[1:q])
        b <- mmergesort(A[(q+1):length(A)])
        mmerge(a,b)
    } else {
        A
    }
}

您可以运行它

x<-c(18, 16, 8, 7, 6, 3, 11, 9, 15, 1)
mmergesort(x)

在此版本的事情是通过引用替换为:所有的函数返回新值。另外,而不​​是传递在幻灯片指标,我们只是子集载体,并通过他们整体的功能。

In this version thing is replaced via reference: all functions return new values. Additional, rather than passing in slide indexes, we simply subset vectors and pass them whole to the functions.

当然,这个版本的性能很可能是因为一切发生在中间步骤的内存重新分配的受苦。没有太多可以做的,因为语言如何被设计在底座R上。如果你喜欢,你可以写C / C ++ code和调用通过<一href="http://cran.r-project.org/doc/manuals/r-release/R-exts.html#System-and-foreign-language-interfaces">foreign语言界面的。

Of course the performance of this version is likely to suffer because of all the memory reallocations that occur at the intermediate steps. There's not much you can do about that in base R because of how the language was designed. If you like, you can write C/C++ code and call that via the foreign language interfaces.

如果你想离开你的合并原样(而忽略了R-的方式做的事情),那么你可以做......

If you want to leave your Merge as-is (and ignore the R-way to do things), then you could do...

MergeSort<-function(A, p, r) {
    if(p < r) {
        q <- floor((p+r)/2)
        A <- MergeSort(A, p, q)
        A <- MergeSort(A, q+1, r)
        Merge(A, p, q, r)
    } else {
        A
    }
}
x <- c(18, 16, 8, 7, 6, 3, 11, 9, 15, 1)
MergeSort(x, 1, length(x))

更新:

包括标准制定线束

m1<-function() {
    x<-sample(1000, 250);
    mmergesort(x)
}

m2<-function() {
    x<-sample(1000, 250);
    MergeSort(x, 1, length(x))
}

microbenchmark(m1(), m2())

这篇关于合并排序在研发的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆