std::lower_bound slower for std::vector than std::map::find

Problem description

I wrote a class to act as a wrapper around a sequential container (std::vector/std::queue/std::list) to have the interface of a std::map, for performance when using small numbers of small objects. The coding was all remarkably simple given the algorithms that already existed. This code is obviously highly trimmed from my full code, but shows the problem.

template <class key_, 
          class mapped_, 
          class traits_ = std::less<key_>,
          class undertype_ = std::vector<std::pair<key_,mapped_> >
         >
class associative
{
public:
    typedef traits_ key_compare;
    typedef key_ key_type;
    typedef mapped_ mapped_type;
    typedef std::pair<const key_type, mapped_type> value_type;
    typedef typename undertype_::allocator_type allocator_type;
    typedef typename allocator_type::template rebind<value_type>::other value_allocator_type;
    typedef typename undertype_::const_iterator const_iterator;

    class value_compare {
        key_compare pred_;
    public:
        inline value_compare(key_compare pred=key_compare()) : pred_(pred) {}
        inline bool operator()(const value_type& left, const value_type& right) const {return pred_(left.first,right.first);}
        inline bool operator()(const value_type& left, const key_type& right) const {return pred_(left.first,right);}
        inline bool operator()(const key_type& left, const value_type& right) const {return pred_(left,right.first);}
        inline bool operator()(const key_type& left, const key_type& right) const {return pred_(left,right);}
        inline key_compare key_comp( ) const {return pred_;}
    };
    class iterator  {
    public:       
        typedef typename value_allocator_type::difference_type difference_type;
        typedef typename value_allocator_type::value_type value_type;
        typedef typename value_allocator_type::reference reference;
        typedef typename value_allocator_type::pointer pointer;
        typedef std::bidirectional_iterator_tag iterator_category;
        inline iterator(const typename undertype_::iterator& rhs) : data(rhs) {}
        inline reference operator*() const { return reinterpret_cast<reference>(*data);}
        inline pointer operator->() const {return reinterpret_cast<pointer>(structure_dereference_operator(data));}
        operator const_iterator&() const {return data;}
    protected:
        typename undertype_::iterator data;
    };

    template<class input_iterator>
    inline associative(input_iterator first, input_iterator last) : internal_(first, last), comp_() 
    {if (std::is_sorted(internal_.begin(), internal_.end())==false) std::sort(internal_.begin(), internal_.end(), comp_);}

    inline iterator find(const key_type& key) {
        iterator i = std::lower_bound(internal_.begin(), internal_.end(), key, comp_);
        return (comp_(key,*i) ? internal_.end() : i);
    }

protected:
    undertype_ internal_;
    value_compare comp_;
};

SSCCE at http://ideone.com/Ufn7r, full code at http://ideone.com/MQr0Z (note: the resulting times on IdeOne are highly erratic, probably due to server load, and do not clearly show the results in question)

I tested with std::string, and PODs from 4 to 128 bytes, ranging from 8 to 2000 elements with MSVC10.

I expected higher performance for (1) creating from a range for small objects, (2) random insertion/erasure for small numbers of small objects, and (3) lookup for all objects. Surprisingly, the vector was significantly faster for creating from a range in all tests, and faster for random erasure depending on size, up to about 2048 bytes (512 4-byte objects, or 128 16-byte objects, etc). However, most shocking of all was that std::vector using std::lower_bound was slower than std::map::find for all PODs. The difference was minuscule for 4- and 8-byte PODs, but for 128-byte PODs std::vector was up to 36% slower! However, for std::string, std::vector was 6% faster on average.

I feel like std::lower_bound on a sorted std::vector should have outperformed std::map due to better cache locality and smaller memory size, and since the map can be imperfectly balanced; in the worst case it should match std::map. I can't for the life of me think of any reason that std::map should be faster. My only thought is that the predicate is somehow slowing it down, but I can't figure out how. So the question is: how could it be that std::lower_bound on a sorted std::vector is outperformed by a std::map (in MSVC10)?

[EDIT] I've confirmed that std::lower_bound on std::vector<std::pair<4BYTEPOD,4BYTEPOD>> uses fewer comparisons on average than std::map<4BYTEPOD,4BYTEPOD>::find (by 0-0.25), but my implementation is still up to 26% slower.
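
A comparison count like the one mentioned in this edit could be gathered with a counting comparator. The following is only a minimal sketch: counting_less, vector_comparisons and map_comparisons are hypothetical names that do not appear in the original code, and plain int keys stand in for the 4-byte PODs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical comparator that counts how often it is invoked.
struct counting_less {
    std::size_t* calls;
    explicit counting_less(std::size_t* c) : calls(c) {}
    bool operator()(int a, int b) const { ++*calls; return a < b; }
};

// Number of comparisons std::lower_bound makes on a sorted vector holding 0..n-1.
std::size_t vector_comparisons(int n, int key) {
    std::vector<int> v(n);
    for (int i = 0; i < n; ++i) v[i] = i;
    std::size_t calls = 0;
    std::vector<int>::iterator it =
        std::lower_bound(v.begin(), v.end(), key, counting_less(&calls));
    (void)it;  // only the comparison count is of interest here
    return calls;
}

// Number of comparisons std::map::find makes on a map holding the keys 0..n-1.
std::size_t map_comparisons(int n, int key) {
    std::size_t calls = 0;
    counting_less cmp(&calls);
    std::map<int, int, counting_less> m(cmp);
    for (int i = 0; i < n; ++i) m.insert(std::make_pair(i, i));
    calls = 0;  // discard the comparisons made while inserting
    m.find(key);
    return calls;
}
```

Averaging both functions over many keys gives the per-lookup comparison counts. The exact numbers depend on the standard library implementation.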

[POST-ANSWER-EDIT] I made an SSCCE at http://ideone.com/41iKt that removes all unneeded fluff, and clearly shows that find on the sorted vector is slower than that of the map, by ~15%.

Solution

This is a somewhat more interesting nut to crack! Before discussing my findings so far, let me point out that the associative::find() function behaves differently to std::map::find(): if the key isn't found the former returns the lower bound while the latter returns end(). To fix this, associative::find() needs to be changed to become something like this:

auto rc = std::lower_bound(this->internal_.begin(), this->internal_.end(), key, this->comp_);
return rc != this->internal_.end() && !this->comp_(key, rc->first)? rc: this->internal_.end();
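
The same pattern can be sketched outside the class. This is a minimal illustration rather than the container's actual code: storage, key_less and find_key are hypothetical names, and int keys and values are assumed.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

typedef std::vector<std::pair<int, int> > storage;

// Hypothetical comparator: orders a stored element against a bare key.
struct key_less {
    bool operator()(const std::pair<int, int>& element, int key) const {
        return element.first < key;
    }
};

// Returns an iterator to the element whose key equals `key`, or v.end().
storage::const_iterator find_key(const storage& v, int key) {
    storage::const_iterator it =
        std::lower_bound(v.begin(), v.end(), key, key_less());
    // lower_bound yields the first element with first >= key, so a hit
    // additionally requires that key is not less than that element's key.
    if (it != v.end() && !(key < it->first)) return it;
    return v.end();
}
```

With a vector holding the keys 1, 3 and 5, find_key(v, 3) returns the matching element, while find_key(v, 4) returns v.end() instead of the lower bound, which matches what std::map::find() does.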

Now that we are more likely to compare apples to apples (I haven't verified that the logic is really correct now), let's go on to investigate the performance. I'm not quite convinced that the approach used to test performance really holds water, but I'm sticking with it for now, and I could definitely improve the performance of the associative container. I don't think I have found all the performance issues in the code but I have, at least, made some progress. The biggest one is that the comparison function used in associative is pretty bad because it keeps making copies, which puts this container at somewhat of a disadvantage. If you check the comparator you probably won't see it, because it looks as if the comparator takes its arguments by reference! The issue is actually rather subtle: the underlying container has a value_type of std::pair<key_type, mapped_type> but the comparator takes std::pair<key_type const, mapped_type> as its argument! These are different types, so each call converts the stored element into a temporary pair, copying both the key and the mapped value. Fixing this seems to give the associative container quite a bit of a performance boost.
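
The hidden copies can be made visible with a key type that counts its copy constructions. This is a sketch under assumptions: counting_key, stored_type, less_converting and less_exact are hypothetical names introduced only for illustration, and the typedef named value_type mirrors the one declared in the question's class.

```cpp
#include <cassert>
#include <utility>

// Hypothetical key type that counts how often it is copy-constructed.
struct counting_key {
    int value;
    explicit counting_key(int v) : value(v) {}
    counting_key(const counting_key& other) : value(other.value) { ++copies(); }
    static int& copies() { static int n = 0; return n; }
};

typedef std::pair<counting_key, int> stored_type;       // what the vector holds
typedef std::pair<const counting_key, int> value_type;  // what the comparator names

// Parameter type differs from stored_type: each argument is first converted
// into a temporary value_type, which copies the key.
bool less_converting(const value_type& l, const value_type& r) {
    return l.first.value < r.first.value;
}

// Parameter type matches stored_type: the references bind directly.
bool less_exact(const stored_type& l, const stored_type& r) {
    return l.first.value < r.first.value;
}

// Key copies made by one call of the mismatched comparator.
int copies_with_converting() {
    stored_type a(counting_key(1), 10), b(counting_key(2), 20);
    counting_key::copies() = 0;  // ignore copies made while building a and b
    less_converting(a, b);
    return counting_key::copies();
}

// Key copies made by one call of the exactly matching comparator.
int copies_with_exact() {
    stored_type a(counting_key(1), 10), b(counting_key(2), 20);
    counting_key::copies() = 0;
    less_exact(a, b);
    return counting_key::copies();
}
```

Each call through the mismatched signature copies both keys (and both mapped values), which becomes expensive for 128-byte PODs across every step of a binary search.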

To implement a comparator class which has no opportunity to fail to match the arguments exactly, I'm using a simple helper to detect whether a type is a std::pair<L, R>:

template <typename>               struct is_pair                  { enum { value = false }; };
template <typename F, typename S> struct is_pair<std::pair<F, S>> { enum { value = true }; };
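
As a quick check of what this trait accepts, a few compile-time assertions can be placed next to it. This is only a sketch: the static_assert lines are additions for illustration, and the trait is repeated so that the block stands alone.

```cpp
#include <cassert>
#include <string>
#include <utility>

template <typename>               struct is_pair                   { enum { value = false }; };
template <typename F, typename S> struct is_pair<std::pair<F, S> > { enum { value = true }; };

// Any std::pair instantiation is detected, whatever its member types are.
static_assert(is_pair<std::pair<int, std::string> >::value, "pair detected");
static_assert(is_pair<std::pair<const int, int> >::value, "const member is fine");
// Anything else falls back to the primary template.
static_assert(!is_pair<int>::value, "int is not a pair");
// A const-qualified pair type does not match the partial specialization.
static_assert(!is_pair<const std::pair<int, int> >::value, "cv-qualified pair");
```

The last case does not arise when the trait is applied to a type deduced from a parameter declared as L const&, because deduction strips the top-level const.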

... and then I replaced the comparator with this, slightly more complicated, one:

class value_compare {
    key_compare pred_;
public:
    inline value_compare(key_compare pred=key_compare()) : pred_(pred) {}
    template <typename L, typename R>
    inline typename std::enable_if<is_pair<L>::value && is_pair<R>::value, bool>::type
    operator()(L const& left, R const& right) const {
        return pred_(left.first,right.first);
    }
    template <typename L, typename R>
    inline typename std::enable_if<is_pair<L>::value && !is_pair<R>::value, bool>::type
    operator()(L const& left, R const& right) const {
        return pred_(left.first,right);
    }
    template <typename L, typename R>
    inline typename std::enable_if<!is_pair<L>::value && is_pair<R>::value, bool>::type
    operator()(L const& left, R const& right) const {
        return pred_(left,right.first);
    }
    template <typename L, typename R>
    inline typename std::enable_if<!is_pair<L>::value && !is_pair<R>::value, bool>::type
    operator()(L const& left, R const& right) const {
        return pred_(left,right);
    }
    inline key_compare key_comp( ) const {return pred_;}
};

This generally gets the two approaches quite a bit closer together. Given that I would expect the std::vector<T> with lower_bound() approach to be a lot better than using std::map<K, T>, I feel the investigation isn't over yet.

Addendum:

Rethinking the exercise a bit more, I spotted why I felt uncomfortable with the implementation of the predicate class: it is way too complex! This can be done a lot more simply by not using std::enable_if, for a change: this nicely reduces the code to something which is much easier to read. The key is to get the key:

template <typename Key>
Key const& get_key(Key const& value)                  { return value; }
template <typename Key,  typename Value>
Key const& get_key(std::pair<Key, Value> const& pair) { return pair.first; }

With this implementation to get hold of a "key" from either a value or a pair of values, the predicate object can just define one very simple function call operator:

template <typename L, typename R>
bool operator()(L const& l, R const& r) const
{
    return this->pred_(get_key<key_type>(l), get_key<key_type>(r));
}

There is a slight trick in this as well, though: the expected key_type needs to be passed to the get_key() function. Without this the predicate wouldn't work in cases where the key_type is itself a std::pair<F, S> of objects.
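
To see why the explicit key_type matters, here is a minimal sketch in which the key is itself a pair. The names key_compare_by, point, entry and contains are hypothetical and exist only for this illustration. The get_key overloads are repeated from above.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

template <typename Key>
Key const& get_key(Key const& value) { return value; }
template <typename Key, typename Value>
Key const& get_key(std::pair<Key, Value> const& pair) { return pair.first; }

// Hypothetical standalone predicate that extracts keys via get_key<KeyType>.
template <typename KeyType, typename Pred = std::less<KeyType> >
struct key_compare_by {
    Pred pred_;
    template <typename L, typename R>
    bool operator()(L const& l, R const& r) const {
        return pred_(get_key<KeyType>(l), get_key<KeyType>(r));
    }
};

typedef std::pair<int, int> point;     // a key that is itself a std::pair
typedef std::pair<point, char> entry;  // stored element: key plus mapped value

// True if `sorted` (ordered by key) contains an element with the given key.
bool contains(std::vector<entry> const& sorted, point const& key) {
    key_compare_by<point> cmp;
    std::vector<entry>::const_iterator it =
        std::lower_bound(sorted.begin(), sorted.end(), key, cmp);
    return it != sorted.end() && !cmp(key, *it);
}
```

Calling get_key(l) without the explicit template argument on a bare point would select the pair overload and return only the first int, so the predicate would compare the wrong thing. Passing key_type explicitly makes only the intended overload viable for each argument.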
