的std :: LOWER_BOUND慢性病::向量不是性病::地图::找到 [英] std::lower_bound slower for std::vector than std::map::find

查看:209
本文介绍了的std :: LOWER_BOUND慢性病::向量不是性病::地图::找到的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我写了一个类来充当围绕顺序容器(的std ::矢量 / 的std ::队列 / 的std ::列表)有一个的std ::地图,性能在使用小接口小物件的数量。编码一切都非常简单因为已经存在的算法。这code是明显的的从我的全code修整,但显示了问题。

I wrote a class to act as a wrapper around a sequential container (std::vector/std::queue/std::list) to have the interface of a std::map, for performance when using small numbers of small objects. The coding was all remarkably simple given the algorithms that already existed. This code is obviously highly trimmed from my full code, but shows the problem.

template <class key_, 
          class mapped_, 
          class traits_ = std::less<key_>,
          class undertype_ = std::vector<std::pair<key_,mapped_> >
         >
class associative
{
public:
    typedef traits_ key_compare;
    typedef key_ key_type;
    typedef mapped_ mapped_type;
    typedef std::pair<const key_type, mapped_type> value_type;
    typedef typename undertype_::allocator_type allocator_type;
    typedef typename allocator_type::template rebind<value_type>::other value_allocator_type;
    typedef typename undertype_::const_iterator const_iterator;

    class value_compare {
        key_compare pred_;
    public:
        inline value_compare(key_compare pred=key_compare()) : pred_(pred) {}
        inline bool operator()(const value_type& left, const value_type& right) const {return pred_(left.first,right.first);}
        inline bool operator()(const value_type& left, const key_type& right) const {return pred_(left.first,right);}
        inline bool operator()(const key_type& left, const value_type& right) const {return pred_(left,right.first);}
        inline bool operator()(const key_type& left, const key_type& right) const {return pred_(left,right);}
        inline key_compare key_comp( ) const {return pred_;}
    };
    class iterator  {
    public:       
        typedef typename value_allocator_type::difference_type difference_type;
        typedef typename value_allocator_type::value_type value_type;
        typedef typename value_allocator_type::reference reference;
        typedef typename value_allocator_type::pointer pointer;
        typedef std::bidirectional_iterator_tag iterator_category;
        inline iterator(const typename undertype_::iterator& rhs) : data(rhs) {}
    inline reference operator*() const { return reinterpret_cast<reference>(*data);}
        inline pointer operator->() const {return reinterpret_cast<pointer>(structure_dereference_operator(data));}
        operator const_iterator&() const {return data;}
    protected:
        typename undertype_::iterator data;
    };

    template<class input_iterator>
    inline associative(input_iterator first, input_iterator last) : internal_(first, last), comp_() 
    {if (std::is_sorted(internal_.begin(), internal_.end())==false) std::sort(internal_.begin(), internal_.end(), comp_);}

inline iterator find(const key_type& key) {
    iterator i = std::lower_bound(internal_.begin(), internal_.end(), key, comp_);
    return (comp_(key,*i) ? internal_.end() : i);
}

protected:
    undertype_ internal_;
    value_compare comp_;
};

SSCCE在 http://ideone.com/Ufn7r ,全code处的 http://ideone.com/MQr0Z (注:导致倍IdeOne是非常不稳定的,可能是由于服务器的负载,也不要清楚地显示出的结果中的问题)

SSCCE at http://ideone.com/Ufn7r, full code at http://ideone.com/MQr0Z (note: resulting times as IdeOne are highly erratic, probably due to server load, and do not clearly show the results in question)

我用的std ::从4到128个字节的字符串,和豆荚测试,从8日至2000元素MSVC10。

I tested with std::string, and PODs from 4 to 128 bytes, ranging from 8 to 2000 elements with MSVC10.

我预期为(1)从一个范围创建的物体小,(2)随机插入/擦除发送少量小物体,以及(3)查找所有对象更高的性能。令人惊奇的是,该载体是显著的速度更快,用于从范围创建所有测试,和更快的随机擦除根据大小至多约2048个字节(512 4字节的目的,或128 16字节的目的,等等)。然而,最令人震惊的是,是,的std ::矢量使用的std :: LOWER_BOUND 较慢的std ::地图::找到所有吊舱。所不同的是微乎其微的4和8字节吊舱,要不是128字节的吊舱,的std ::矢量高达36%慢!然而,对于的std ::字符串的std ::矢量 6%的平均水平。

I expected higher performance for (1) creating from a range for small objects, (2) random insertion/erasure for small numbers of small objects, and (3) lookup for all objects. Surprisingly, the vector was significantly faster for creating from a range for all tests, and faster for random erasure depending on size up to about 2048 bytes (512 4-byte objects, or 128 16-byte objects, etc). However, most shocking of all, was that the std::vector using std::lower_bound was slower than the std::map::find for all PODs. The difference was miniscule for 4 and 8-byte PODs, and but for 128-byte PODs, std::vector was up to 36% slower! However, for std::string, the std::vector was 6% faster on average.

我觉得的std :: LOWER_BOUND 在排序的std ::矢量应该跑赢的std ::地图由于更好​​的缓存位置/小内存大小,因为地图可以是不完全平衡的,或者在最坏的情况下的应该的匹配的std ::地图,但不能为我的生活想不出什么原因的std ::地图应该会更快。我唯一​​的想法是predicate莫名其妙地减缓下来,但我无法弄清楚如何。所以,问题是:这怎么可能是的std :: LOWER_BOUND 在排序的std ::矢量的std ::地图上优于(在MSVC10)?

I feel like std::lower_bound on a sorted std::vector should have outperformed std::map due to better cache locality/smaller memory size, and since the map can be imperfectly balanced, or in the worst case it should match std::map, but can't for the life of me think of any reason that std::map should be faster. My only thought is the predicate is somehow slowing it down, but I can't figure out how. So the question is: How could it be that std::lower_bound on a sorted std::vector be outperformed by a std::map (in MSVC10)?

我已经证实了的std :: LOWER_BOUND 的std ::矢量&lt;的std ::对&LT; 4BYTEPOD,4BYTEPOD&GT;&GT ; 使用较少的比较的平均比的std ::地图&LT; 4BYTEPOD,4BYTEPOD&GT; ::找到(由0〜 0.25),但我的实现仍高达慢26%。

I've confirmed that std::lower_bound on std::vector<std::pair<4BYTEPOD,4BYTEPOD>> uses fewer comparisons on average than std::map<4BYTEPOD,4BYTEPOD>::find (by 0-0.25), but my implementation is still up to 26% slower.

[POST-ANSWER-编辑]我在做一个SSCCE http://ideone.com/41iKt 的删除所有不必要的绒毛,并且清楚地表明,查找的排序向量比较慢地图,由〜15%。

[POST-ANSWER-EDIT] I made a SSCCE at http://ideone.com/41iKt that removes all unneeded fluff, and clearly shows that find on the sorted vector is slower than that of the map, by ~15%.

推荐答案

这是一个较为有趣的疙瘩!在讨论我的发现至今,让我指出,联想::找到()函数的行为不同,以的std ::地图::找到( ):如果找不到键前者返回下界,而后者返回端()。为了解决这个问题,联想::找到()需要改变成为这样的:

This is a somewhat more interesting nut to crack! Before discussing my findings so far, let me point out that the associative::find() function behaves differently to std::map::find(): if the key isn't found the former returns the lower bound while the latter returns end(). To fix this, associative::find() needs to be changed to become something like this:

auto rc = std::lower_bound(this->internal_.begin(), this->internal_.end(), key, this->comp_);
return rc != this->internal_.end() && !this->comp_(key, rc->first)? rc: this->internal_.end();

现在,我们更有可能比较苹果苹果(如果逻辑实在是正确的,现在我还没有验证),让我们去调查的性能。我不太相信,这种方法用于测试高性能真正站得住脚,但我会继续使用它,现在我可以肯定提高了关联容器的性能。我不认为我已经挺在code中所有的性能问题,但是,至少,取得了一些进展。最大的是注意到,在关联是pretty的不好,因为它使复印使用的比较功能。这是处于劣势把这个容器一些。如果您正在检查的比较,现在你可能看不到它,因为它看起来好像这个比较引用传递!这个问题其实是相当微妙:基础容器有一个的value_type 的std ::对&LT;为key_type,mapped_type&GT; 但比较正在的std ::对&LT; key_type的常量,mapped_type&GT; 作为参数!修复这似乎给关联容器相当多的性能提升。

Now that we are more likely to compare apples to apples (I haven't verified if the logic is really correct now), let's go on to investigate the performance. I'm not quite convinced that the approach used to test performance really hold water but I'm sticking with it for now and I could definitely improve the performance of the associative container. I don't think I have quite found all performance issues in the code but, at least, made some progress. The biggest is to noticed that the comparison function used in the associative is pretty bad because it keeps making copies. This is putting this container somewhat at a disadvantage. If you are checking the comparator now you probably don't see it because it looks as if this comparator is passing by reference! The issue is actually rather subtle: the underlying container has a value_type of std::pair<key_type, mapped_type> but the comparator is taking std::pair<key_type const, mapped_type> as argument! Fixing this seems to give the associative container quite a bit of a performance boost.

要实现它有没有机会失败精确匹配我用一个简单的辅助参数检测一个类型是的std ::对℃的比较级,L,R&GT;

To implement a comparator class which has not opportunity to fail matching the arguments exactly I'm using a simple helper to detect if a type is a std::pair<L, R>:

template <typename>               struct is_pair                  { enum { value = false }; };
template <typename F, typename S> struct is_pair<std::pair<F, S>> { enum { value = true }; };

...然后我换成比较这个,稍微复杂一些,一:

... and then I replaced the comparator with this, slightly more complicated, one:

class value_compare {
    key_compare pred_;
public:
    inline value_compare(key_compare pred=key_compare()) : pred_(pred) {}
    template <typename L, typename R>
    inline typename std::enable_if<is_pair<L>::value && is_pair<R>::value, bool>::type
    operator()(L const& left, R const& right) const {
        return pred_(left.first,right.first);
    }
    template <typename L, typename R>
    inline typename std::enable_if<is_pair<L>::value && !is_pair<R>::value, bool>::type
    operator()(L const& left, R const& right) const {
        return pred_(left.first,right);
    }
    template <typename L, typename R>
    inline typename std::enable_if<!is_pair<L>::value && is_pair<R>::value, bool>::type
    operator()(L const& left, R const& right) const {
        return pred_(left,right.first);
    }
    template <typename L, typename R>
    inline typename std::enable_if<!is_pair<L>::value && !is_pair<R>::value, bool>::type
    operator()(L const& left, R const& right) const {
        return pred_(left,right);
    }
    inline key_compare key_comp( ) const {return pred_;}
};

这通常得到两种做法颇有几分紧密地结合在一起。考虑到我所期望的是,的std ::矢量&lt; T&GT; LOWER_BOUND()的做法应该是好了很多,比使用的std ::地图&LT; K,T&GT; 我觉得调查还没有结束,但

This generally gets the two approaches quite a bit closer together. Given that I would expect that the std::vector<T> with lower_bound() approach should be a lot better than using std::map<K, T> I feel the investigation isn't over, yet.

附录

反思运动多一点,我看到为什么我觉得不舒服的predicate类的实现:它是办法以复杂!这可以做了很多通过简单的不可以中使用的std :: enable_if 一变:这很好地降低了code到一些东西,是更容易阅读。关键是要拿到钥匙:

Rethinking the exercise a bit more I spotted why I felt uncomfortable with the implementation of the predicate class: it is way to complex! This can be done a lot simpler by not using std::enable_if for a change: this nicely reduces the code to something which is much easier to read. The key is to get the key:

template <typename Key>
Key const& get_key(Key const& value)                  { return value; }
template <typename Key,  typename Value>
Key const& get_key(std::pair<Key, Value> const& pair) { return pair.first; }

通过这个实现弄个从一个值或一对值的钥匙,在predicate对象可以只定义一个很简单的函数调用操作:

With this implementation to get hold of a "key" from either a value or a pair of values, the predicate object can just define one very simple function call operator:

template <typename L, typename R>
bool operator()(L const& l, R const& r)
{
    return this->pred_(get_key<key_type>(l), get_key<key_type>(r));
}

目前在这个略有招为好,但:预期为key_type 需要传递给 get_key()功能。如果没有这个的predicate不会的情况下工作,其中为key_type 本身就是一个的std ::对&LT; F,S&GT; 。

There is a slight trick in this as well, though: the expected key_type needs to be passed to the get_key() function. Without this the predicate wouldn't work in cases where the key_type is itself a std::pair<F, S> of objects.

这篇关于的std :: LOWER_BOUND慢性病::向量不是性病::地图::找到的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
相关文章
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆