哪个容器存储唯一值? [英] What container to store unique values?

查看:82
本文介绍了哪个容器存储唯一值?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我遇到以下问题.我有一个平均每秒运行60帧的游戏.我需要将每个值存储在容器中的每一帧,并且不得重复.

I've got the following problem. I have a game which runs on average 60 frames per second. Each frame I need to store values in a container and there must be no duplicates.

它可能每帧必须存储少于100个项目,但是插入调用的数量将会更多(由于唯一性,许多拒绝被调用).仅在框架的末尾,我才需要遍历容器.因此,每帧容器约有60次迭代,但是插入次数更多.

It probably has to store less than 100 items per frame, but the number of insert-calls will be alot more (and many rejected due to it has to be unique). Only at the end of the frame do I need to traverse the container. So about 60 iterations of the container per frame, but alot more insertions.

请记住要存储的项目是简单的整数.

Keep in mind the items to store are simple integer.

我可以使用一堆容器,但是我不能确定该选择什么.性能是关键.

There are a bunch of containers I can use for this but I cannot make up my mind what to pick. Performance is the key issue for this.

我收集的一些优点/缺点:

Some pros/cons that I've gathered:

向量

  • (PRO):连续的内存,一个很大的因素.
  • (PRO):可以先保留内存,然后再保留很少的分配/取消分配
  • (CON):除了遍历容器(std :: find)每个insert()以查找唯一键之外,别无选择吗?比较起来很简单(整数),并且整个容器可能可以容纳缓存

设置

  • (PRO):简单,显然是为此目的
  • (CON):插入时间不恒定
  • (CON):每帧大量分配/取消分配
  • (CON):不连续的内存.遍历一组数百个对象意味着在内存中跳来跳去.

无序设置

  • (PRO):简单,显然是为此目的
  • (PRO):平均情况下恒定时间插入
  • (CON):看到我存储整数时,哈希运算可能比其他任何东西都要昂贵
  • (CON):每帧大量分配/取消分配
  • (CON):不连续的内存.遍历一组数百个对象意味着在内存中跳来跳去.

由于内存访问模式的缘故,我倾向于采用向量路由,即使set显然是针对此问题的.对我而言,一个尚不清楚的大问题是,遍历每个插入的向量是否比分配/取消分配(尤其是考虑必须多久执行一次)和set的内存查找的开销更大.

I'm leaning on going the vector-route because of memory access patterns, even though set is clearly meant for this issue. The big issue that is unclear to me is whether traversing the vector for each insert is more costly than the allocations/deallocations (especially considering how often this must be done) and the memory lookups of set.

我知道最终归结为对每种情况进行概要分析,但是如果只是作为先例或从理论上讲,在这种情况下最好的选择是什么?还有可能我也错过的任何利弊吗?

I know ultimately it all comes down to profiling each case, but if nothing else than as a headstart or just theoretically, what would probably be best in this scenario? Are there any pros/cons I might've missed aswell?

正如我没有提到的,容器在每一帧的末尾都被清除()

As I didnt mention, the container is cleared() at the end of each frame

推荐答案

在这里,我要大声疾呼,建议矢量路由在大小为100且存储的对象为时最有效.整数值.这样做的简单原因是set和unordered_set为每个插入分配内存,而向量不必超过一次.

I'm going to put my neck on the block here and suggest that the vector route is probably most efficient when the size is 100 and the objects being stored are integral values. The simple reason for this is that set and unordered_set allocate memory for each insert whereas the vector needn't more than once.

通过保持向量有序,可以显着提高搜索性能,因为所有搜索都可以是二进制搜索,因此可以在log2N时间内完成.

You can increase search performance dramatically by keeping the vector ordered, since then all searches can be binary searches and therefore complete in log2N time.

不利的一面是,由于内存移动,插入将花费很少的时间,但听起来似乎比插入要多得多,并且移动(平均)50个连续的存储字几乎是瞬时操作.

The downside is that the inserts will take a tiny fraction longer due to the memory moves, but it sounds as if there will be many more searches than inserts, and moving (average) 50 contiguous memory words is an almost instantaneous operation.

最后一个字: 现在编写正确的逻辑.用户抱怨时担心性能.

Final word: Write the correct logic now. Worry about performance when the users are complaining.

因为我束手无策,所以这里是一个相当完整的实现:

Because I couldn't help myself, here's a reasonably complete implementation:

template<typename T>
struct vector_set
{
    using vec_type = std::vector<T>;
    using const_iterator = typename vec_type::const_iterator;
    using iterator = typename vec_type::iterator;

    vector_set(size_t max_size)
    : _max_size { max_size }
    {
        _v.reserve(_max_size);
    }

    /// @returns: pair of iterator, bool
    /// If the value has been inserted, the bool will be true
    /// the iterator will point to the value, or end if it wasn't
    /// inserted due to space exhaustion
    auto insert(const T& elem)
    -> std::pair<iterator, bool>
    {
        if (_v.size() < _max_size) {
            auto it = std::lower_bound(_v.begin(), _v.end(), elem);
            if (_v.end() == it || *it != elem) {
                return make_pair(_v.insert(it, elem), true);
            }
            return make_pair(it, false);
        }
        else {
            return make_pair(_v.end(), false);
        }
    }

    auto find(const T& elem) const
    -> const_iterator
    {
        auto vend = _v.end();
        auto it = std::lower_bound(_v.begin(), vend, elem);
        if (it != vend && *it != elem)
            it = vend;
        return it;
    }

    bool contains(const T& elem) const {
        return find(elem) != _v.end();
    }

    const_iterator begin() const {
        return _v.begin();
    }

    const_iterator end() const {
        return _v.end();
    }


private:
    vec_type _v;
    size_t _max_size;
};

using namespace std;


BOOST_AUTO_TEST_CASE(play_unique_vector)
{
    vector_set<int> v(100);

    for (size_t i = 0 ; i < 1000000 ; ++i) {
        v.insert(int(random() % 200));
    }

    cout << "unique integers:" << endl;
    copy(begin(v), end(v), ostream_iterator<int>(cout, ","));
    cout << endl;

    cout << "contains 100: " << v.contains(100) << endl;
    cout << "contains 101: " << v.contains(101) << endl;
    cout << "contains 102: " << v.contains(102) << endl;
    cout << "contains 103: " << v.contains(103) << endl;
}

这篇关于哪个容器存储唯一值?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆