了解归并排序优化:避免副本 [英] Understanding merge sort optimization: avoiding copies
问题描述
下面我有在算法书合并排序程序,它被提及的主要问题是,合并两个有序列表需要线性额外的内存,并复制到临时数组花额外的工作和背面,整个算法,具有大幅放缓的那种效果。这种复制,可避免在递归的交替水平明智切换a和tmp_array的角色。
我的问题是什么意思作家抄袭,可避免开关明智的和tmp_array角色在递归替代水平,以及如何有可能在以下code?请求显示一个例子,我们如何能做到这一点?
无效归并(INPUT_TYPE一个[],无符号整数N){
INPUT_TYPE * tmp_array;
tmp_array =(INPUT_TYPE *)malloc的(第(n + 1)*的sizeof(INPUT_TYPE));
m_sort(一个,tmp_array,1,n)的;
免费(tmp_array);
}
无效m_sort(INPUT_TYPE一个[],INPUT_TYPE tmp_array [],诠释左,右诠释){
INT中心;
如果(左<右){
中心=(左+右)/ 2;
m_sort(一,tmp_array,左,中);
m_sort(一,tmp_array,中心+ 1,右);
合并(A,tmp_array,左,+ 1,右);
}
}
无效合并(INPUT_TYPE一个[],INPUT_TYPE tmp_array [],INT l_pos,诠释r_pos,诠释right_end){
INT I,left_end,num_elements,tmp_pos;
left_end = r_pos - 1;
tmp_pos = l_pos;
num_elements = right_end - l_pos + 1;
/ *主循环* /
而((1_pos&其中; = left_end)及及(r_pos&其中; = right_end))
如果(A [1_pos] LT = A [r_pos])
tmp_array [tmp_pos ++] = A [l_pos ++];
其他
tmp_array [tmp_pos ++] = A [r_pos ++];
而(l_pos< = left_end)的前半/ *副本休息* /
tmp_array [tmp_pos ++] = A [l_pos ++];
而(r_pos< = right_end)下半年/ *副本休息* /
tmp_array [tmp_pos ++] = A [r_pos ++];
/ *复制tmp_array回* /
对于(i = 1; I< = num_elements;我++,right_end--)
一个[right_end] = tmp_array [right_end]
}
我会认为,没有看这个code,它正在执行通过声明相同的内存大小为连续的块归并排序原来的。
所以常合并排序是这样的:
- 在拆分阵列中的一半
- 类半阵列通过递归对他们调用归并
- 合并半阵列回
我猜想它是递归的,所以没有副本将我们排序的子阵列尺寸2.现在发生了什么才做什么?
_意味着它是我们的内存可用,但我们不关心它的数据
原版的:
8 5 2 3 1 7 4 6
_ _ _ _ _ _ _ _
开始递归调用:
递归调用1:
(8 5 2 3)(1 7 4 6)
_ _ _ _ _ _ _ _
递归调用2:
((8 5)(2 3))((1 7)(4 6))
_ _ _ _ _ _ _ _
递归调用3:
(((8)(5))((2)(3)))(((1)(7))((4)(6)))
_ _ _ _ _ _ _ _
递归调用与合并解决,PLUS复印(慢的方式):
合并呼叫3,使用临时空间:
(((8)(5))((2)(3)))(((1)(7))((4)(6))) - \地执行合并
((5 8)(2 3))((1 7)(4 6))≤; - /操作
必要:拷回:
((5 8)(2 3))((1 7)(4 6))< - \副本,
_ _ _ _ _ _ _ _ - /忽略老
合并呼叫2,使用临时空间:
((5 8)(2 3))((1 7)(4 6)) - \执行合并
(2 3 5 8)(1 4 6 7)< - /操作
必要:拷回:
(2 3 5 8)(1 4 6 7)< - \副本,
_ _ _ _ _ _ _ _ - /忽略老
合并呼叫1,使用临时空间:
(2 3 5 8)(1 4 6 7) - \执行合并
1 2 3 4 5 6 7 8示 - /操作
必要:拷回:
1 2 3 4 5 6 7 8示 - \副本,
_ _ _ _ _ _ _ _ - /忽略老
什么是笔者提示 递归调用与合并解决,而不复制(更快的方法):
合并呼叫3,使用临时空间:
(((8)(5))((2)(3)))(((1)(7))((4)(6))) - \地执行合并
((5 8)(2 3))((1 7)(4 6))≤; - /操作
合并呼叫2,使用旧的阵列作为临时空间:
(2 3 5 8)(1 4 6 7)< - \执行合并
((5 8)(2 3))((1 7)(4 6)) - /操作(向后)
合并呼叫1,使用临时空间:
(2 3 5 8)(1 4 6 7) - \执行合并
1 2 3 4 5 6 7 8示 - /操作
你去那里:你不需要做副本,只要你执行锁步合并排序树的每一个级别,如上图所示。
。您可能有奇偶校验的一个小问题,也正如上面展示。也就是说,你的结果可能会在你的 temp_array
。要么你有三个选择来处理这样的:
- 返回
temp_array
作为答案,并释放旧的内存(如果你的应用是细跟的) - 执行单个阵列复制操作,复制
temp_array
返回到原来的数组 - 让自己消耗只有两次为-的内存,并进行合并,从
temp_array1
单周期temp_array2
然后再返回到original_array
,然后释放temp_array2
。奇偶问题应该得到解决。
I have below merge sort program in algorithms book, it is mentioned that The main problem is that merging two sorted lists requires linear extra memory, and the additional work spent copying to the temporary array and back, throughout the algorithm, has the effect of slowing down the sort considerably. This copying can be avoided by judiciously switching the roles of "a" and "tmp_array" at alternate levels of the recursion.
My question is what does author mean "copying can be avoided by judiciously switching the roles of a and tmp_array at alternate levels of the recursion" and how it is possible in following code? Request to show an example how we can achieve this?
void mergesort( input_type a[], unsigned int n ) {
input_type *tmp_array;
tmp_array = (input_type *) malloc( (n+1) * sizeof (input_type) );
m_sort( a, tmp_array, 1, n );
free( tmp_array );
}
void m_sort( input_type a[], input_type tmp_array[ ], int left, int right ) {
int center;
if( left < right ) {
center = (left + right) / 2;
m_sort( a, tmp_array, left, center );
m_sort( a, tmp_array, center+1, right );
merge( a, tmp_array, left, center+1, right );
}
}
void merge( input_type a[ ], input_type tmp_array[ ], int l_pos, int r_pos, int right_end ) {
int i, left_end, num_elements, tmp_pos;
left_end = r_pos - 1;
tmp_pos = l_pos;
num_elements = right_end - l_pos + 1;
/* main loop */
while( ( 1_pos <= left_end ) && ( r_pos <= right_end ) )
if( a[1_pos] <= a[r_pos] )
tmp_array[tmp_pos++] = a[l_pos++];
else
tmp_array[tmp_pos++] = a[r_pos++];
while( l_pos <= left_end ) /* copy rest of first half */
tmp_array[tmp_pos++] = a[l_pos++];
while( r_pos <= right_end ) /* copy rest of second half */
tmp_array[tmp_pos++] = a[r_pos++];
/* copy tmp_array back */
for(i=1; i <= num_elements; i++, right_end-- )
a[right_end] = tmp_array[right_end];
}
I'm going to assume that, without looking at this code, it is performing merge sort by declaring a contiguous block of memory the same size as the original.
So normally merge sort is like this:
- split array in half
- sort half-arrays by recursively invoking MergeSort on them
- merge half-arrays back
I'm assuming it's recursive, so no copies will be done before we're sorting sub-arrays of size 2. Now what happens?
_ means it is memory we have available, but we don't care about the data in it
original:
8 5 2 3 1 7 4 6
_ _ _ _ _ _ _ _
Begin recursive calls:
recursive call 1:
(8 5 2 3) (1 7 4 6)
_ _ _ _ _ _ _ _
recursive call 2:
((8 5) (2 3)) ((1 7) (4 6))
_ _ _ _ _ _ _ _
recursive call 3:
(((8) (5))((2) (3)))(((1) (7))((4) (6)))
_ _ _ _ _ _ _ _
Recursive calls resolving with merging, PLUS COPYING (slower method):
merge for call 3, using temp space:
(((8) (5))((2) (3)))(((1) (7))((4) (6))) --\ perform merge
(( 5 8 )( 2 3 ))(( 1 7 )( 4 6 )) <--/ operation
UNNECESSARY: copy back:
(( 5 8 )( 2 3 ))(( 1 7 )( 4 6 )) <--\ copy and
_ _ _ _ _ _ _ _ --/ ignore old
merge for call 2, using temp space:
(( 5 8 )( 2 3 ))(( 1 7 )( 4 6 )) --\ perform merge
( 2 3 5 8 )( 1 4 6 7 ) <--/ operation
UNNECESSARY: copy back:
( 2 3 5 8 )( 1 4 6 7 ) <--\ copy and
_ _ _ _ _ _ _ _ --/ ignore old
merge for call 1, using temp space:
( 2 3 5 8 )( 1 4 6 7 ) --\ perform merge
1 2 3 4 5 6 7 8 <--/ operation
UNNECESSARY: copy back:
1 2 3 4 5 6 7 8 <--\ copy and
_ _ _ _ _ _ _ _ --/ ignore old
What the author is suggesting Recursive calls resolving with merging, WITHOUT COPYING (faster method):
merge for call 3, using temp space:
(((8) (5))((2) (3)))(((1) (7))((4) (6))) --\ perform merge
(( 5 8 )( 2 3 ))(( 1 7 )( 4 6 )) <--/ operation
merge for call 2, using old array as temp space:
( 2 3 5 8 )( 1 4 6 7 ) <--\ perform merge
(( 5 8 )( 2 3 ))(( 1 7 )( 4 6 )) --/ operation (backwards)
merge for call 1, using temp space:
( 2 3 5 8 )( 1 4 6 7 ) --\ perform merge
1 2 3 4 5 6 7 8 <--/ operation
There you go: you don't need to do copies as long as you perform each "level" of the merge-sort tree in lock-step, as shown above.
You may have a minor issue of parity, also as demonstrated above. That is, your result may be in your temp_array
. You either have three options for dealing with this:
- returning the
temp_array
as the answer, and release the old memory (if your application is fine with that) - perform a single array copy operation, to copy
temp_array
back into your original array - allow yourself to consume a mere twice-as-much memory, and perform a single cycle of merges from
temp_array1
totemp_array2
then back tooriginal_array
, then releasetemp_array2
. The parity issue should be resolved.
这篇关于了解归并排序优化:避免副本的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!