就地排序 [英] Sorting in place

查看:34
本文介绍了就地排序的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

就地排序"是什么意思?

What is meant by to "sort in place"?

推荐答案

就地算法的想法并不是排序所独有的,但排序可能是最重要的情况,或者至少是最广为人知的情况.这个想法是关于空间效率的 - 使用最少的 RAM、硬盘或其他存储空间.这在几十年前尤为重要,当时硬件更加有限.

The idea of an in-place algorithm isn't unique to sorting, but sorting is probably the most important case, or at least the most well-known. The idea is about space efficiency - using the minimum amount of RAM, hard disk or other storage that you can get away with. This was especially relevant going back a few decades, when hardware was much more limited.

这个想法是通过连续转换数据直到产生输出,在包含输入的相同存储空间中产生输出.这避免了使用两倍存储空间的需要 - 一个区域用于输入,一个大小相等的区域用于输出.

The idea is to produce an output in the same memory space that contains the input by successively transforming that data until the output is produced. This avoids the need to use twice the storage - one area for the input and an equal-sized area for the output.

排序是一个相当明显的例子,因为排序可以通过重复交换项目来完成——排序只是重新排列项目.交换并不是唯一的方法——例如,插入排序使用了一种稍微不同的方法相当于进行一系列交换,但速度更快.

Sorting is a fairly obvious case for this because sorting can be done by repeatedly exchanging items - sorting only re-arranges items. Exchanges aren't the only approach - the Insertion Sort, for example, uses a slightly different approach which is equivalent to doing a run of exchanges but faster.

另一个例子是矩阵转置 - 同样,这可以通过交换项目来实现.通过从最低有效数字开始并向上传播进位,也可以就地添加两个非常大的数字(结果替换输入之一).

Another example is matrix transposition - again, this can be implemented by exchanging items. Adding two very large numbers can also be done in-place (the result replacing one of the inputs) by starting at the least significant digit and propogating carries upwards.

回到排序,当你想到一堆 打孔卡 - 最好避免复制打孔卡只是为了对它们进行分类.

Getting back to sorting, the advantages to re-arranging "in place" get even more obvious when you think of stacks of punched cards - it's preferable to avoid copying punched cards just to sort them.

某些排序算法允许这种就地操作,而其他算法则不允许.

Some algorithms for sorting allow this style of in-place operation whereas others don't.

然而,所有算法都需要一些额外的工作变量存储空间.如果目标只是通过连续修改输入来产生输出,那么通过保留大量内存,使用它来生成一些辅助数据结构,然后使用它来指导这些修改来定义算法是相当容易的.您仍然通过就地"转换输入来产生输出,但您破坏了练习的全部意义 - 您没有节省空间.

However, all algorithms require some additional storage for working variables. If the goal is simply to produce the output by successively modifying the input, it's fairly easy to define algorithms that do that by reserving a huge chunk of memory, using that to produce some auxiliary data structure, then using that to guide those modifications. You're still producing the output by transforming the input "in place", but you're defeating the whole point of the exercise - you're not being space-efficient.

因此,就地定义的正常定义要求您达到某种空间效率标准.使用与输入成正比的额外空间(即 O(n) 额外空间)并仍然称您的算法为就地"是绝对不能接受的.

For that reason, the normal definition of an in-place definition requires that you achieve some standard of space efficiency. It's absolutely not acceptable to use extra space proportional to the input (that is, O(n) extra space) and still call your algorithm "in-place".

关于就地算法的维基百科页面目前声称,就地算法只能使用恒定量 - O(1) - 的额外空间.

The Wikipedia page on in-place algorithms currently claims that an in-place algorithm can only use a constant amount - O(1) - of extra space.

在计算机科学中,就地算法(或拉丁文原位算法)是一种算法,它使用具有少量恒定额外存储空间的数据结构来转换输入.

In computer science, an in-place algorithm (or in Latin in situ) is an algorithm which transforms input using a data structure with a small, constant amount of extra storage space.

计算复杂性部分中指定了一些技术细节,但是结论仍然是例如快速排序需要 O(log n) 空间(真),因此不是就地排序(我认为这是假的).

There are some technicalities specified in the In Computational Complexity section, but the conclusion is still that e.g. Quicksort requires O(log n) space (true) and therefore is not in-place (which I believe is false).

O(log n) 远小于 O(n) - 例如,16,777,216 的基数为 2 的对数是 24.

O(log n) is much smaller than O(n) - for example the base 2 log of 16,777,216 is 24.

Quicksort 和 heapsort 通常都被认为是就地排序,堆排序可以用 O(1) 额外空间来实现(我之前误解了这一点).Mergesort 更难就地实现,但异地版本对缓存非常友好——我怀疑现实世界的实现接受 O(n) 空间开销——RAM 很便宜,但内存带宽是一个主要瓶颈,因此,以缓存效率和速度换取内存通常是一笔不错的交易.

Quicksort and heapsort are both normally considered in-place, and heapsort can be implemented with O(1) extra space (I was mistaken about this earlier). Mergesort is more difficult to implement in-place, but the out-of-place version is very cache-friendly - I suspect real-world implementations accept the O(n) space overhead - RAM is cheap but memory bandwidth is a major bottleneck, so trading memory for cache-efficiency and speed is often a good deal.

[EDIT 当我写上述内容时,我假设我正在考虑对数组进行就地合并排序.链表的就地归并排序非常简单.关键的区别在于合并算法——在没有复制或重新分配的情况下合并两个链表很容易,对更大数组的两个子数组(并且没有 O(n) 辅助存储)做同样的事情 AFAIK 不是.]

[EDIT When I wrote the above, I assume I was thinking of in-place merge-sorting of an array. In-place merge-sorting of a linked list is very simple. The key difference is in the merge algorithm - doing a merge of two linked lists with no copying or reallocation is easy, doing the same with two sub-arrays of a larger array (and without O(n) auxiliary storage) AFAIK isn't.]

Quicksort 也是缓存高效的,即使是就地,但可以通过吸引其最坏情况的行为来取消其作为就地算法的资格.有一种退化情况(在非随机版本中,通常在输入已经排序时),其中运行时间是 O(n^2) 而不是预期的 O(n log n).在这种情况下,额外的空间需求也增加到 O(n).然而,对于大型数据集和一些基本的预防措施(主要是随机枢轴选择),这种最坏情况的行为变得非常不可能.

Quicksort is also cache-efficient, even in-place, but can be disqualified as an in-place algorithm by appealing to its worst-case behaviour. There is a degenerate case (in a non-randomized version, typically when the input is already sorted) where the run-time is O(n^2) rather than the expected O(n log n). In this case the extra space requirement is also increased to O(n). However, for large datasets and with some basic precautions (mainly randomized pivot selection) this worst-case behaviour becomes absurdly unlikely.

我个人的观点是 O(log n) 额外空间对于就地算法是可以接受的 - 这不是作弊,因为它不会破坏就地工作的原始点.

My personal view is that O(log n) extra space is acceptable for in-place algorithms - it's not cheating as it doesn't defeat the original point of working in-place.

然而,我的意见当然只是我的意见.

However, my opinion is of course just my opinion.

一个额外的注意 - 有时,人们会就地调用一个函数,因为它具有用于输入和输出的单个参数.不一定意味着该函数是空间高效的,结果是通过转换输入产生的,甚至参数仍然引用相同的内存区域.这种用法是不正确的(或者 prescriptivists 会声称),尽管它很常见最好注意但不要为此感到压力.

One extra note - sometimes, people will call a function in-place simply because it has a single parameter for both the input and the output. It doesn't necessarily follow that the function was space efficient, that the result was produced by transforming the input, or even that the parameter still references the same area of memory. This usage isn't correct (or so the prescriptivists will claim), though it's common enough that it's best to be aware but not get stressed about it.

这篇关于就地排序的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆