Fast hash function for `std::vector`

Problem description

I implemented this solution for getting a hash value from a `vector<T>`:

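#include <cstddef>
#include <functional>
#include <vector>

// Same mixing step as boost::hash_combine.
// Declared before the specialization so that name lookup finds it there.
template <class T>
inline void hash_combine(std::size_t& seed, T const& v)
{
    seed ^= std::hash<T>()(v) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

// Note: the standard only permits specializing std::hash for program-defined
// types, so a separate functor type is the more portable choice.
namespace std
{
    template<typename T>
    struct hash<vector<T>>
    {
        typedef vector<T> argument_type;
        typedef std::size_t result_type;
        result_type operator()(argument_type const& in) const
        {
            size_t size = in.size();
            size_t seed = 0;
            for (size_t i = 0; i < size; i++)
                // Combine the hash of the current element with the hashes of the previous ones
                hash_combine(seed, in[i]);
            return seed;
        }
    };
}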

But this solution doesn't scale at all: with a `vector<double>` of 10 million elements it takes more than 2.5 s (according to VS).

Is there a fast hash function for this scenario?

Note that creating a hash value from the vector's address is not a feasible solution: the related `unordered_map` will be used across different runs, and two `vector<double>` objects with the same content but different addresses would be mapped differently (undesired behavior for this application).
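
To make the content-versus-address requirement concrete, here is a minimal sketch. It assumes a standalone functor rather than a specialization inside `namespace std`; the names `VectorHash` and `VectorMap` are hypothetical and not part of the original post. The hash depends only on the elements, so two vectors with equal contents land on the same key.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <vector>

// Same mixing step as in the question (modelled on boost::hash_combine).
template <class T>
inline void hash_combine(std::size_t& seed, T const& v)
{
    seed ^= std::hash<T>()(v) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

// Hypothetical functor name: hashes a vector by its contents only,
// never by its address.
template <class T>
struct VectorHash
{
    std::size_t operator()(std::vector<T> const& in) const
    {
        std::size_t seed = 0;
        for (T const& x : in)
            hash_combine(seed, x);
        return seed;
    }
};

// Hypothetical alias for illustration: a map keyed by vector contents.
// Key equality uses std::vector's own operator==.
using VectorMap = std::unordered_map<std::vector<double>, int, VectorHash<double>>;
```

With this in place, inserting under one vector and looking up with a separately allocated copy of it finds the same entry.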

Solution

NOTE: As per the comments, you get a 25-50x speed-up by compiling with optimizations. Do that first. Then, if it's still too slow, see below.
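
One way to check the effect of optimizations yourself is a small timing harness like the sketch below. The helper name `time_hash_ms` is hypothetical; the idea is to build it once without optimizations and once with them (for example a Release configuration or `/O2` in Visual Studio, `-O2` with GCC or Clang) and compare the reported times. The actual numbers depend on your machine and compiler.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Same mixing step as in the question (modelled on boost::hash_combine).
template <class T>
inline void hash_combine(std::size_t& seed, T const& v)
{
    seed ^= std::hash<T>()(v) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

// Hypothetical helper: hashes `in` once and returns the elapsed
// wall-clock time in milliseconds. The hash is written to `result`
// so the optimizer cannot discard the loop.
inline double time_hash_ms(std::vector<double> const& in, std::size_t& result)
{
    auto const start = std::chrono::steady_clock::now();
    std::size_t seed = 0;
    for (double x : in)
        hash_combine(seed, x);
    auto const stop = std::chrono::steady_clock::now();
    result = seed;
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling it on a `std::vector<double>` with 10 million elements reproduces the scenario from the question.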


I don't think there's much you can do. You have to touch all the elements, and that combination function is about as fast as it gets.

One option may be to parallelize the hash function. If you have 8 cores, you can run 8 threads that each hash 1/8th of the vector, then combine the 8 resulting values at the end. The synchronization overhead may be worth it for very large vectors.
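
The chunked approach could look roughly like this. It is a sketch under assumptions, not a tuned implementation: `hash_slice` and `parallel_hash` are hypothetical names, the work is spread with `std::async`, and the partial hashes are folded together in chunk order with the same `hash_combine` step. On some toolchains (for example GCC on Linux) linking may require `-pthread`.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <future>
#include <vector>

// Same mixing step as in the question (modelled on boost::hash_combine).
template <class T>
inline void hash_combine(std::size_t& seed, T const& v)
{
    seed ^= std::hash<T>()(v) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

// Sequentially hash the half-open index range [begin, end) of the vector.
template <class T>
std::size_t hash_slice(std::vector<T> const& in, std::size_t begin, std::size_t end)
{
    std::size_t seed = 0;
    for (std::size_t i = begin; i < end; ++i)
        hash_combine(seed, in[i]);
    return seed;
}

// Hypothetical helper: split the vector into `chunks` contiguous slices,
// hash each slice on its own task, then combine the partial hashes in
// slice order so the result is deterministic.
template <class T>
std::size_t parallel_hash(std::vector<T> const& in, std::size_t chunks = 8)
{
    assert(chunks > 0 && "need at least one chunk");
    std::size_t const n = in.size();
    std::vector<std::future<std::size_t>> parts;
    parts.reserve(chunks);
    for (std::size_t c = 0; c < chunks; ++c)
    {
        std::size_t const begin = n * c / chunks;
        std::size_t const end = n * (c + 1) / chunks;
        parts.push_back(std::async(std::launch::async,
            [&in, begin, end] { return hash_slice(in, begin, end); }));
    }
    std::size_t seed = 0;
    for (auto& p : parts)
        hash_combine(seed, p.get());
    return seed;
}
```

Note that the value produced this way differs from the purely sequential hash and depends on the chunk count, so the same `chunks` value has to be used everywhere the hashes are compared.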
