了解归并排序优化:避免副本 [英] Understanding merge sort optimization: avoiding copies

查看:142
本文介绍了了解归并排序优化:避免副本的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

下面我有在算法书合并排序程序,它被提及的主要问题是,合并两个有序列表需要线性额外的内存,并复制到临时数组花额外的工作和背面,整个算法,具有大幅放缓的那种效果。这种复制,可避免在递归的交替水平明智切换a和tmp_array的角色。

我的问题是什么意思作家抄袭,可避免开关明智的和tmp_array角色在递归替代水平,以及如何有可能在以下code?请求显示一个例子,我们如何能做到这一点?

 无效归并(INPUT_TYPE一​​个[],无符号​​整数N){

    INPUT_TYPE * tmp_array;
    tmp_array =(INPUT_TYPE *)malloc的(第(n + 1)*的sizeof(INPUT_TYPE));
    m_sort(一个,tmp_array,1,n)的;
    免费(tmp_array);
}

无效m_sort(INPUT_TYPE一​​个[],INPUT_TYPE tmp_array [],诠释左,右诠释){

    INT中心;
    如果(左<右){

    中心=(左+右)/ 2;
    m_sort(一,tmp_array,左,中);
    m_sort(一,tmp_array,中心+ 1,右);
    合并(A,tmp_array,左,+ 1,右);
    }
}

无效合并(INPUT_TYPE一​​个[],INPUT_TYPE tmp_array [],INT l_pos,诠释r_pos,诠释right_end){

    INT I,left_end,num_elements,tmp_pos;
    left_end = r_pos  -  1;
    tmp_pos = l_pos;
    num_elements = right_end  -  l_pos + 1;

    / *主循环* /

    而((1_pos&其中; = left_end)及及(r_pos&其中; = right_end))

        如果(A [1_pos] LT = A [r_pos])
            tmp_array [tmp_pos ++] = A [l_pos ++];
        其他
            tmp_array [tmp_pos ++] = A [r_pos ++];

    而(l_pos< = left_end)的前半/ *副本休息* /
        tmp_array [tmp_pos ++] = A [l_pos ++];

    而(r_pos< = right_end)下半年/ *副本休息* /
        tmp_array [tmp_pos ++] = A [r_pos ++];

    / *复制tmp_array回* /
    对于(i = 1; I< = num_elements;我++,right_end--)
        一个[right_end] = tmp_array [right_end]

}
 

解决方案

我会认为,没有看这个code,它正在执行通过声明相同的内存大小为连续的块归并排序原来的。

所以常合并排序是这样的:

  • 在拆分阵列中的一半
  • 类半阵列通过递归对他们调用归并
  • 合并半阵列回

我猜想它是递归的,所以没有副本将我们排序的子阵列尺寸2.现在发生了什么才做什么?

  _意味着它是我们的内存可用,但我们不关心它的数据

原版的:
   8 5 2 3 1 7 4 6
   _ _ _ _ _ _ _ _
 

开始递归调用:

 递归调用1:
  (8 5 2 3)(1 7 4 6)
   _ _ _ _ _ _ _ _

递归调用2:
 ((8 5)(2 3))((1 7)(4 6))
   _ _ _ _ _ _ _ _

递归调用3:
(((8)(5))((2)(3)))(((1)(7))((4)(6)))
   _ _ _ _ _ _ _ _
 

递归调用与合并解决,PLUS复印(慢的方式):

 合并呼叫3,使用临时空间:
(((8)(5))((2)(3)))(((1)(7))((4)(6))) -  \地执行合并
((5 8)(2 3))((1 7)(4 6))≤;  -  /操作

必要:拷回:
((5 8)(2 3))((1 7)(4 6))<  -  \副本,
   _ _ _ _ _ _ _ _  -  /忽略老

合并呼叫2,使用临时空间:
((5 8)(2 3))((1 7)(4 6)) -  \执行合并
(2 3 5 8)(1 4 6 7)<  -  /操作

必要:拷回:
(2 3 5 8)(1 4 6 7)<  -  \副本,
   _ _ _ _ _ _ _ _  -  /忽略老

合并呼叫1,使用临时空间:
(2 3 5 8)(1 4 6 7) -  \执行合并
   1 2 3 4 5 6 7 8示 -  /操作

必要:拷回:
   1 2 3 4 5 6 7 8示 -  \副本,
   _ _ _ _ _ _ _ _  -  /忽略老
 

什么是笔者提示 递归调用与合并解决,而不复制(更快的方法):

 合并呼叫3,使用临时空间:
(((8)(5))((2)(3)))(((1)(7))((4)(6))) -  \地执行合并
((5 8)(2 3))((1 7)(4 6))≤;  -  /操作

合并呼叫2,使用旧的阵列作为临时空间:
(2 3 5 8)(1 4 6 7)<  -  \执行合并
((5 8)(2 3))((1 7)(4 6)) -  /操作(向后)

合并呼叫1,使用临时空间:
(2 3 5 8)(1 4 6 7) -  \执行合并
   1 2 3 4 5 6 7 8示 -  /操作
 

你去那里:你不需要做副本,只要你执行锁步合并排序树的每一个级别,如上图所示。

您可能有奇偶校验的一个小问题,也正如上面展示。也就是说,你的结果可能会在你的 temp_array 。要么你有三个选择来处理这样的:

  • 返回 temp_array 作为答案,并释放旧的内存(如果你的应用是细跟的)
  • 执行单个阵列复制操作,复制 temp_array 返回到原来的数组
  • 让自己消耗只有两次为-的内存,并进行合并,从 temp_array1 单周期 temp_array2 然后再返回到 original_array ,然后释放 temp_array2 。奇偶问题应该得到解决。

I have below merge sort program in algorithms book, it is mentioned that The main problem is that merging two sorted lists requires linear extra memory, and the additional work spent copying to the temporary array and back, throughout the algorithm, has the effect of slowing down the sort considerably. This copying can be avoided by judiciously switching the roles of "a" and "tmp_array" at alternate levels of the recursion.

My question is what does author mean "copying can be avoided by judiciously switching the roles of a and tmp_array at alternate levels of the recursion" and how it is possible in following code? Request to show an example how we can achieve this?

void mergesort( input_type a[], unsigned int n ) {

    input_type *tmp_array;
    tmp_array = (input_type *) malloc( (n+1) * sizeof (input_type) );
    m_sort( a, tmp_array, 1, n );
    free( tmp_array );
}

void m_sort( input_type a[], input_type tmp_array[ ], int left, int right ) {

    int center;
    if( left < right ) {

    center = (left + right) / 2;
    m_sort( a, tmp_array, left, center );
    m_sort( a, tmp_array, center+1, right );
    merge( a, tmp_array, left, center+1, right );
    }
}

void merge( input_type a[ ], input_type tmp_array[ ], int l_pos, int r_pos, int right_end ) {

    int i, left_end, num_elements, tmp_pos;
    left_end = r_pos - 1;
    tmp_pos = l_pos;
    num_elements = right_end - l_pos + 1;

    /* main loop */

    while( ( 1_pos <= left_end ) && ( r_pos <= right_end ) )

        if( a[1_pos] <= a[r_pos] )
            tmp_array[tmp_pos++] = a[l_pos++];
        else
            tmp_array[tmp_pos++] = a[r_pos++];

    while( l_pos <= left_end )  /* copy rest of first half */
        tmp_array[tmp_pos++] = a[l_pos++];

    while( r_pos <= right_end ) /* copy rest of second half */
        tmp_array[tmp_pos++] = a[r_pos++];

    /* copy tmp_array back */
    for(i=1; i <= num_elements; i++, right_end-- )
        a[right_end] = tmp_array[right_end];

}

解决方案

I'm going to assume that, without looking at this code, it is performing merge sort by declaring a contiguous block of memory the same size as the original.

So normally merge sort is like this:

  • split array in half
  • sort half-arrays by recursively invoking MergeSort on them
  • merge half-arrays back

I'm assuming it's recursive, so no copies will be done before we're sorting sub-arrays of size 2. Now what happens?

_ means it is memory we have available, but we don't care about the data in it

original:
   8    5    2    3      1    7    4    6
   _    _    _    _      _    _    _    _

Begin recursive calls:

recursive call 1:
  (8    5    2    3)    (1    7    4    6)
   _    _    _    _      _    _    _    _

recursive call 2:
 ((8    5)  (2    3))  ((1    7)  (4    6))
   _    _    _    _      _    _    _    _

recursive call 3:
(((8)  (5))((2)  (3)))(((1)  (7))((4)  (6)))
   _    _    _    _      _    _    _    _

Recursive calls resolving with merging, PLUS COPYING (slower method):

merge for call 3, using temp space:
(((8)  (5))((2)  (3)))(((1)  (7))((4)  (6)))    --\ perform merge
(( 5    8 )( 2    3 ))(( 1    7 )( 4    6 ))   <--/ operation

UNNECESSARY: copy back:
(( 5    8 )( 2    3 ))(( 1    7 )( 4    6 ))   <--\ copy and
   _    _    _    _      _    _    _    _       --/ ignore old

merge for call 2, using temp space:
(( 5    8 )( 2    3 ))(( 1    7 )( 4    6 ))    --\ perform merge
(  2    3    5    8  )(  1    4    6    7  )   <--/ operation

UNNECESSARY: copy back:
(  2    3    5    8  )(  1    4    6    7  )   <--\ copy and
   _    _    _    _      _    _    _    _       --/ ignore old

merge for call 1, using temp space:
(  2    3    5    8  )(  1    4    6    7  )    --\ perform merge
   1    2    3    4      5    6    7    8      <--/ operation

UNNECESSARY: copy back:
   1    2    3    4      5    6    7    8      <--\ copy and
   _    _    _    _      _    _    _    _       --/ ignore old

What the author is suggesting Recursive calls resolving with merging, WITHOUT COPYING (faster method):

merge for call 3, using temp space:
(((8)  (5))((2)  (3)))(((1)  (7))((4)  (6)))    --\ perform merge
(( 5    8 )( 2    3 ))(( 1    7 )( 4    6 ))   <--/ operation

merge for call 2, using old array as temp space:
(  2    3    5    8  )(  1    4    6    7  )   <--\ perform merge
(( 5    8 )( 2    3 ))(( 1    7 )( 4    6 ))    --/ operation (backwards)

merge for call 1, using temp space:
(  2    3    5    8  )(  1    4    6    7  )    --\ perform merge
   1    2    3    4      5    6    7    8      <--/ operation

There you go: you don't need to do copies as long as you perform each "level" of the merge-sort tree in lock-step, as shown above.

You may have a minor issue of parity, also as demonstrated above. That is, your result may be in your temp_array. You either have three options for dealing with this:

  • returning the temp_array as the answer, and release the old memory (if your application is fine with that)
  • perform a single array copy operation, to copy temp_array back into your original array
  • allow yourself to consume a mere twice-as-much memory, and perform a single cycle of merges from temp_array1 to temp_array2 then back to original_array, then release temp_array2. The parity issue should be resolved.

这篇关于了解归并排序优化:避免副本的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆