最好的算法,用于从列表中删除重复值 [英] Best Algorithm for Removing Duplicate Values from a List
问题描述
什么是从列表中删除重复值的最佳算法? 我试过这样的:
What is best algorithm for removing duplicate values from a list ? I've tried this:
for (int i = 0; i < AuthorCounter-1; i++)
{
for (int j = 0; j < AuthorCounter-1; j++)
{
if (i != j)
{
if (AuthorGroupNode.Nodes[i].Text == AuthorGroupNode.Nodes[j].Text)
{
AuthorGroupNode.Nodes[j].Remove();
AuthorCounter--;
}
}
}
}
下面, AuthorGroupNodes
是节点的列表。它做正确的事情在一定程度上,但并不完美。任何一个有更好的解决办法???
Here, AuthorGroupNodes
is a list on nodes. It did things right to some extent, but not perfect. Any one have better solution ???
推荐答案
您当前的算法是O(N平方),这将表现相当不佳的大名单。
Your current algorithm is O(N-squared), which will perform quite poorly for a large list.
如果空间不是一个问题,你可以保持一个的HashSet&LT; INT&GT;
节点的散列。一旦遍历该列表。如果该节点的哈希是在HashSet的,你就知道这是一个重复的节点。跳过。如果散列不处于HashSet中,这个节点添加到一个新的列表,并且该节点的哈希添加到HashSet的
If space is not an issue, you could keep a HashSet<int>
of hashes of nodes. Traverse the list once. If the hash of the node is in the HashSet, you know this is a duplicate node. Skip it. If the hash is not in the HashSet, add this node to a new list, and add the hash of the node to the HashSet.
这将执行O(N),并要求内存的原始列表,该列表减去重复的副本,并为HashSet的。该算法是无损
This will perform O(N), and requires memory for the original list, for a copy of the list less any duplicates, and for the HashSet. The algorithm is non-destructive.
如果您可以使用LINQ,根本就
If you can use Linq, simply do
var distinctList = originalList.Distinct().ToList();
更新
发现多数民众赞成pretty的多少究竟如何乔恩斯基特重新实现显着。
Discovered that's pretty much exactly how Jon Skeet re-implemented Distinct.
public static IEnumerable<TSource> Distinct<TSource>(
this IEnumerable<TSource> source)
{
return source.Distinct(EqualityComparer<TSource>.Default);
}
public static IEnumerable<TSource> Distinct<TSource>(
this IEnumerable<TSource> source,
IEqualityComparer<TSource> comparer)
{
if (source == null)
{
throw new ArgumentNullException("source");
}
return DistinctImpl(source, comparer ?? EqualityComparer<TSource>.Default);
}
private static IEnumerable<TSource> DistinctImpl<TSource>(
IEnumerable<TSource> source,
IEqualityComparer<TSource> comparer)
{
HashSet<TSource> seenElements = new HashSet<TSource>(comparer);
foreach (TSource item in source)
{
if (seenElements.Add(item))
{
yield return item;
}
}
}
<一个href="http://msmvps.com/blogs/jon_skeet/archive/2010/12/30/reimplementing-linq-to-objects-part-14-distinct.aspx" rel="nofollow">http://msmvps.com/blogs/jon_skeet/archive/2010/12/30/reimplementing-linq-to-objects-part-14-distinct.aspx
这篇关于最好的算法,用于从列表中删除重复值的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!