Performance of built-in .NET collection sorters


Question

A question was asked about how to sort a List. Several methods were offered, from the basic List.Sort() to List.OrderBy(). The most laughable was a roll-your-own SelectionSort. I promptly voted that down, but it made me think: wouldn't Linq's OrderBy(), applied to a list, do the same thing? myList.OrderBy(x => x.Property).ToList() would produce an iterator that basically finds the minimum value of the projection in what's left of the collection and yield returns it. Run over the entire list, that's a selection sort.

Which made me think: what algorithms do the built-in sorters for Lists, SortedLists, Enumerables, etc. use, and by extension, should any of them be avoided for large collections? A SortedList, as it stays sorted by key, would probably use a single-pass InsertionSort on each add: find the first index with a value greater than the new one, and insert before it. Lists and Arrays probably MergeSort themselves pretty efficiently, but I don't know the actual algorithm behind Sort(). We've discussed OrderBy.

What I know above would seem to indicate that List.Sort() or Array.Sort() is the best option for a list of known size, and that using Linq to sort an in-memory list or array should be discouraged. For a stream, there really isn't any other way than to OrderBy() the enumerable; the performance loss is mitigated by the fact that you can keep the data as a stream instead of having to have it all before sorting it.

Edit:

The general consensus is that Sort() is faster given a concrete List or Array. OrderBy is reasonable but slower, because it first copies the passed enumerable into an array, which adds O(N) time and storage. Initializing a SortedList from unsorted data ends up being O(N^2) because of what's under the hood. Moral of the story: use List.Sort() instead of List.OrderBy() when you have an actual List.

Recommended answer

Enumerable.OrderBy() slurps the IEnumerable<> into an array and uses quick sort, so it has O(n) storage requirements. It's done by an internal class in System.Core.dll, EnumerableSorter<TElement>.QuickSort(). The storage cost makes it uncompetitive with simply sorting the list, if you have one, since List<> sorts in-place. Linq often optimizes by checking the true capabilities of the IEnumerable with the is operator. That won't work here, since List<>.Sort is destructive.
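To make the destructive versus non-destructive distinction concrete, here is a minimal sketch. The class and method names (OrderByVersusSort, SortedCopy, SortInPlace) are made up for illustration; only OrderBy, ToList and List<T>.Sort are real library calls.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderByVersusSort
{
    // Returns a new sorted list; the input list is left untouched.
    public static List<int> SortedCopy(List<int> source)
    {
        // OrderBy buffers the whole sequence before it can yield the first element.
        return source.OrderBy(x => x).ToList();
    }

    // Sorts the caller's list in place, so the original order is lost.
    public static void SortInPlace(List<int> source)
    {
        source.Sort();
    }

    public static void Main()
    {
        var data = new List<int> { 5, 3, 8, 1 };

        List<int> copy = SortedCopy(data);
        Console.WriteLine(string.Join(",", data)); // still in insertion order
        Console.WriteLine(string.Join(",", copy)); // sorted copy

        SortInPlace(data);
        Console.WriteLine(string.Join(",", data)); // the list itself is now sorted
    }
}
```

Because callers of OrderBy may still rely on the source's order, Linq cannot quietly call Sort() on the list behind their back.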

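List<>.Sort and Array.Sort use in-place quick sort. (Since .NET Framework 4.5 the implementation is an introspective sort: quick sort that falls back to heap sort when the recursion gets too deep and to insertion sort on small partitions. It is still in-place.)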
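As a rough illustration, the in-place counterpart of the question's myList.OrderBy(x => x.Property) could look like the sketch below. Item and InPlaceSortExample are hypothetical names standing in for whatever the list actually holds; the Sort overloads taking a Comparison<T> are the real API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical element type, standing in for whatever the list holds.
public sealed class Item
{
    public string Name { get; }
    public int Property { get; }
    public Item(string name, int property) { Name = name; Property = property; }
}

public static class InPlaceSortExample
{
    // In-place counterpart of myList.OrderBy(x => x.Property).ToList().
    public static void SortByProperty(List<Item> items)
    {
        items.Sort((a, b) => a.Property.CompareTo(b.Property));
    }

    // Same idea for an array, via Array.Sort with a Comparison<T>.
    public static void SortByProperty(Item[] items)
    {
        Array.Sort(items, (a, b) => a.Property.CompareTo(b.Property));
    }

    public static void Main()
    {
        var list = new List<Item> { new Item("c", 3), new Item("a", 1), new Item("b", 2) };
        SortByProperty(list);
        foreach (Item item in list)
            Console.WriteLine($"{item.Name}: {item.Property}");
    }
}
```

One caveat when swapping one for the other: List<>.Sort and Array.Sort are documented as unstable, so items with equal keys may change relative order, while OrderBy is documented as a stable sort.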

SortedList<> has O(n) complexity for an insertion, which dominates the O(log(n)) complexity of finding the insertion point. So putting N unsorted items into it will cost O(n^2). SortedDictionary<> uses a red-black tree, giving insert O(log(n)) complexity. Thus it costs O(n log(n)) to fill it, same as the average case of quick sort.
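The two containers are filled the same way from the caller's side, which is why the cost difference is easy to miss. A small sketch, with SortedContainerFill and its method names invented for illustration and the stored values chosen arbitrarily:

```csharp
using System;
using System.Collections.Generic;

public static class SortedContainerFill
{
    // Each Add binary-searches for the slot, then shifts later entries to make room.
    public static SortedList<int, string> FillSortedList(IEnumerable<int> keys)
    {
        var list = new SortedList<int, string>();
        foreach (int key in keys)
            list.Add(key, "value " + key);
        return list;
    }

    // Each Add descends and rebalances a tree; nothing is shifted.
    public static SortedDictionary<int, string> FillSortedDictionary(IEnumerable<int> keys)
    {
        var dict = new SortedDictionary<int, string>();
        foreach (int key in keys)
            dict.Add(key, "value " + key);
        return dict;
    }

    public static void Main()
    {
        int[] keys = { 42, 7, 19, 3, 88 };
        // Both enumerate their keys in ascending order, whatever the insertion order was.
        Console.WriteLine(string.Join(",", FillSortedList(keys).Keys));
        Console.WriteLine(string.Join(",", FillSortedDictionary(keys).Keys));
    }
}
```

With a handful of keys the difference is invisible; the O(n^2) behavior of SortedList<> only starts to matter when many unsorted items are added one at a time.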
