什么是最好的数据结构保持排序顺序的前n个元素? [英] What is the best data structure for keeping the top n elements in sort order?
问题描述
我正在寻找一种数据结构,它保持了顶层的元素,类似于这个问题,但是增加了维护排序顺序的要求。显然,我可以结束排序,但是可能会有一个更有效的方法来做到这一点。只有插入,永远不会删除,然后通过顶部的 n 元素进行迭代。
I am looking for a data structure which keeps the top n elements, similar to this question, but with the added requirement of maintaining sort order. Obviously I could just sort at the end but there might be a more efficient way to do it on the fly. There will only be insertions, never removals, and then an iteration through the top n elements at the end.
这个问题是语言不可知的,它将在C#中,所以使用本机.NET集合的答案是比较可取的。
This question is language agnostic but it will be in C# so an answer that uses native .NET collections is preferable.
编辑:我应该澄清一下排序顺序只有在最后才是最重要的元素被迭代过去。在插入的过程中,排序顺序是无关紧要的,只要顶部的 n 元素被保留。
I should clarify that the sort order only matters at the very end when the top n elements are iterated over. Along the way as there are insertions the sort order is irrelevant as long as the top n elements are retained.
推荐答案
这个答案与Kelly的类似,但是有一个测试的代码示例。由于N的大小小于100,我使用一个简单的插入排序,如果项目数量高于某些(非优化)值(例如20个项目),则使用二进制搜索查找进行修改。我已经包含一个示例控制台应用程序(C#)来显示它的用途。我简单地测试了它,以确保它的工作,但我没有对此进行全面的分析。这个结构是针对减少内存使用进行了优化的。
This answer is similar to Kelly's but has a tested code example. Since the size of N is small < 100, I used a simple insertion sort, modified with a binary search lookup if the number of items is above some (non-optimized) value (e.g. 20 items). I have included a sample console app (C#) to show its use. I tested it briefly to make sure it works, but I didn't do a full analysis of it at the moment. This structure is optimized for reducing memory usage.
public class TopNStructure<T> : IEnumerable<T> where T : IComparable<T>
{
private const int SizeForLinearOrBinaryInsert = 20;
private int _maxSize;
private int _currentSize;
private T[] _items;
private IComparer<T> _comparer;
/// <summary>
/// The number of items
/// </summary>
public int Count { get { return _currentSize; } }
public TopNStructure(int maxSize, IComparer<T> comparer)
{
if (maxSize <= 0)
{
throw new ArgumentOutOfRangeException("Max size must be a postive, non-zero value");
}
_maxSize = maxSize;
_currentSize = 0;
_items = new T[maxSize];
_comparer = comparer;
}
public TopNStructure(int maxSize)
: this(maxSize, Comparer<T>.Default) { }
/// <summary>
/// Adds an item to this structure
/// </summary>
/// <param name="item">The item to add</param>
/// <returns>True if the item was added, false otherwise</returns>
public bool Add(T item)
{
if (_currentSize == 0)
{
_items[0] = item;
}
else if (_currentSize == _maxSize)
{
if (_comparer.Compare(_items[_currentSize - 1], item) <= 0)
{
return false;
}
else
{
Insert(item);
return true;
}
}
else if (_currentSize == 1)
{
if (_comparer.Compare(_items[0], item) <= 0)
{
_items[1] = item;
}
else
{
_items[1] = _items[0];
_items[0] = item;
}
}
else
{
if (_comparer.Compare(_items[_currentSize - 1], item) <= 0)
{
_items[_currentSize] = item;
}
else
{
Insert(item);
}
}
_currentSize++;
return true;
}
/// <summary>
/// Insert the item into the data structure
/// </summary>
/// <param name="item">The item to insert</param>
private void Insert(T item)
{
int start = 0;
if (_currentSize >= SizeForLinearOrBinaryInsert)
{
start = Array.BinarySearch<T>(_items, 0, _currentSize, item, _comparer);
if (start < 0)
{
start = ~start;
}
ShiftAndInsert(item, start, _currentSize);
return;
}
else
{
for (int i = start; i < _currentSize; i++)
{
if (_comparer.Compare(_items[i], item) > 0)
{
ShiftAndInsert(item, i, _currentSize);
return;
}
}
_items[_currentSize] = item;
}
}
/// <summary>
///
/// </summary>
/// <param name="index"></param>
/// <param name="maxIndex"></param>
private void ShiftAndInsert(T item, int index, int maxIndex)
{
if (maxIndex >= _maxSize)
{
maxIndex = _maxSize - 1;
}
for (int i = maxIndex; i > index; i--)
{
_items[i] = _items[i - 1];
}
_items[index] = item;
}
public IEnumerator<T> GetEnumerator()
{
return ((IEnumerable<T>)_items).GetEnumerator();
}
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return _items.GetEnumerator();
}
}
static void Main(string[] args)
{
TopNStructure<double> data = new TopNStructure<double>(25);
Random rand = new Random(132151);
for (int i = 0; i < 50; i++)
{
double value = rand.NextDouble();
data.Add(value);
}
int j = 0;
foreach (double value in data)
{
Console.WriteLine("{0} {1}", j, value);
j++;
}
Console.ReadKey();
}
这篇关于什么是最好的数据结构保持排序顺序的前n个元素?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!