什么是最好的数据结构保持排序顺序的前n个元素? [英] What is the best data structure for keeping the top n elements in sort order?

查看:159
本文介绍了什么是最好的数据结构保持排序顺序的前n个元素?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在寻找一种数据结构,它保持了顶层的元素,类似于这个问题,但是增加了维护排序顺序的要求。显然,我可以结束排序,但是可能会有一个更有效的方法来做到这一点。只有插入,永远不会删除,然后通过顶部的 n 元素进行迭代。

I am looking for a data structure which keeps the top n elements, similar to this question, but with the added requirement of maintaining sort order. Obviously I could just sort at the end but there might be a more efficient way to do it on the fly. There will only be insertions, never removals, and then an iteration through the top n elements at the end.

这个问题是语言不可知的,它将在C#中,所以使用本机.NET集合的答案是比较可取的。

This question is language agnostic but it will be in C# so an answer that uses native .NET collections is preferable.

编辑:我应该澄清一下排序顺序只有在最后才是最重要的元素被迭代过去。在插入的过程中,排序顺序是无关紧要的,只要顶部的 n 元素被保留。

I should clarify that the sort order only matters at the very end when the top n elements are iterated over. Along the way as there are insertions the sort order is irrelevant as long as the top n elements are retained.

推荐答案

这个答案与Kelly的类似,但是有一个测试的代码示例。由于N的大小小于100,我使用一个简单的插入排序,如果项目数量高于某些(非优化)值(例如20个项目),则使用二进制搜索查找进行修改。我已经包含一个示例控制台应用程序(C#)来显示它的用途。我简单地测试了它,以确保它的工作,但我没有对此进行全面的分析。这个结构是针对减少内存使用进行了优化的。

This answer is similar to Kelly's but has a tested code example. Since the size of N is small < 100, I used a simple insertion sort, modified with a binary search lookup if the number of items is above some (non-optimized) value (e.g. 20 items). I have included a sample console app (C#) to show its use. I tested it briefly to make sure it works, but I didn't do a full analysis of it at the moment. This structure is optimized for reducing memory usage.

public class TopNStructure<T> : IEnumerable<T> where T : IComparable<T>
{
    private const int SizeForLinearOrBinaryInsert = 20;

    private int _maxSize;
    private int _currentSize;
    private T[] _items;
    private IComparer<T> _comparer;

    /// <summary>
    /// The number of items
    /// </summary>
    public int Count { get { return _currentSize; } }

    public TopNStructure(int maxSize, IComparer<T> comparer)
    {
        if (maxSize <= 0)
        {
            throw new ArgumentOutOfRangeException("Max size must be a postive, non-zero value");
        }
        _maxSize = maxSize;
        _currentSize = 0;
        _items = new T[maxSize];
        _comparer = comparer;
    }

    public TopNStructure(int maxSize)
        : this(maxSize, Comparer<T>.Default) { }

    /// <summary>
    /// Adds an item to this structure
    /// </summary>
    /// <param name="item">The item to add</param>
    /// <returns>True if the item was added, false otherwise</returns>
    public bool Add(T item)
    {
        if (_currentSize == 0)
        {
            _items[0] = item;              
        }
        else if (_currentSize == _maxSize)
        {
            if (_comparer.Compare(_items[_currentSize - 1], item) <= 0)
            {
                return false;
            }
            else
            {
                Insert(item);
                return true;
            }
        }
        else if (_currentSize == 1)
        {   
            if (_comparer.Compare(_items[0], item) <= 0)
            {
                _items[1] = item;
            }
            else
            {
                _items[1] = _items[0];
                _items[0] = item;
            }               
        } 
        else 
        {
            if (_comparer.Compare(_items[_currentSize - 1], item) <= 0)
            {
                _items[_currentSize] = item;
            }
            else
            {
                Insert(item);
            }
        }
        _currentSize++;
        return true;
    }

    /// <summary>
    /// Insert the item into the data structure
    /// </summary>
    /// <param name="item">The item to insert</param>
    private void Insert(T item)
    {
        int start = 0;
        if (_currentSize >= SizeForLinearOrBinaryInsert)
        {
            start = Array.BinarySearch<T>(_items, 0, _currentSize, item, _comparer);
            if (start < 0)
            {
                start = ~start;
            }
            ShiftAndInsert(item, start, _currentSize);                
            return;
        }
        else
        {
            for (int i = start; i < _currentSize; i++)
            {
                if (_comparer.Compare(_items[i], item) > 0)
                {
                    ShiftAndInsert(item, i, _currentSize);                      
                    return;
                }
            }
            _items[_currentSize] = item;
        }                           
    }

    /// <summary>
    /// 
    /// </summary>
    /// <param name="index"></param>
    /// <param name="maxIndex"></param>
    private void ShiftAndInsert(T item, int index, int maxIndex)
    {
        if (maxIndex >= _maxSize)
        {
            maxIndex = _maxSize - 1;
        }
        for (int i = maxIndex; i > index; i--)
        {
            _items[i] = _items[i - 1];
        }
        _items[index] = item;
    }


    public IEnumerator<T> GetEnumerator()
    {
        return ((IEnumerable<T>)_items).GetEnumerator();
    }

    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
        return _items.GetEnumerator();
    }
}

static void Main(string[] args)
{
    TopNStructure<double> data = new TopNStructure<double>(25);

    Random rand = new Random(132151);
    for (int i = 0; i < 50; i++)
    {
        double value = rand.NextDouble();
        data.Add(value);
    }

    int j = 0;
    foreach (double value in data)
    {
        Console.WriteLine("{0} {1}", j, value);
        j++;
    }
    Console.ReadKey();
}

这篇关于什么是最好的数据结构保持排序顺序的前n个元素?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆