What is the best hash function for uint64_t keys ranging from 0 to its max value?

Question

Assume we have a set of elements that we want to store in a hash-based container (for example std::unordered_set), and each element has a key of type uint64_t whose value can range from 0 to its maximum possible value. Is the trivial hash function, where the hash value of a key is the key itself, the best choice? Does the answer depend on the container in use (e.g. Google's sparse hash vs. the STL's unordered map)? The probability distribution of the key values is unknown.

Recommended answer

If all you have to hash is a uint64_t that can take any possible value with unknown probabilities, and the output must also be a uint64_t, then you gain nothing by changing the value. Just use the key itself.
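As a minimal sketch of the "use the key itself" approach, an identity hash can be plugged into std::unordered_set as shown here. The names IdentityHash and KeySet are hypothetical and chosen only for illustration. The sketch assumes a platform where std::size_t is 64 bits wide; on a 32-bit platform the cast would drop the upper half of the key.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_set>

// Hypothetical functor: the hash value of a key is the key itself.
// Assumption: std::size_t is 64 bits wide, so the cast loses nothing.
// On a 32-bit platform the upper 32 bits of the key would be discarded.
struct IdentityHash {
    std::size_t operator()(std::uint64_t key) const noexcept {
        return static_cast<std::size_t>(key);
    }
};

// Hypothetical alias: a set of uint64_t keys hashed by identity.
using KeySet = std::unordered_set<std::uint64_t, IdentityHash>;
```

A KeySet is then used like any other std::unordered_set: insert, count, and find behave as usual, and only the bucket placement is affected by the choice of hash.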

If you knew something about the distribution of your values, or if your values were restricted to a smaller range (which is really the same thing as knowing something about the distribution), then applying a transformation to the key could be beneficial, but this depends on the implementation of the container. The only benefit would come from reducing collisions when the table transforms a hash into a bucket index, and that depends both on the table's algorithm and on the current or average state of the table (how often each bucket is used).
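To illustrate how the hash-to-bucket step interacts with a known key distribution, the sketch below models a hypothetical table that has a power-of-two bucket count and takes the low bits of the hash as the bucket index. This is an assumption for illustration, not a description of any particular container. The names kBucketCount, bucket_of, mix64, and distinct_buckets are likewise hypothetical. The mixing step uses the well-known splitmix64 finalizer constants as one possible transformation among many.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_set>

// Hypothetical table model: power-of-two bucket count,
// bucket index = low bits of the hash.
constexpr std::size_t kBucketCount = 1024;

inline std::size_t bucket_of(std::uint64_t hash) {
    return static_cast<std::size_t>(hash & (kBucketCount - 1));
}

// One possible key transformation: the splitmix64 finalizer,
// which folds high bits of the key into the low bits.
inline std::uint64_t mix64(std::uint64_t x) {
    x ^= x >> 30;
    x *= 0xbf58476d1ce4e5b9ULL;
    x ^= x >> 27;
    x *= 0x94d049bb133111ebULL;
    x ^= x >> 31;
    return x;
}

// Counts how many distinct buckets are hit by the n keys
// 0, stride, 2 * stride, ... under the given hash function.
template <typename Hash>
std::size_t distinct_buckets(std::uint64_t stride, std::size_t n, Hash hash) {
    std::unordered_set<std::size_t> used;
    for (std::size_t i = 0; i < n; ++i) {
        const std::uint64_t key = static_cast<std::uint64_t>(i) * stride;
        used.insert(bucket_of(hash(key)));
    }
    return used.size();
}
```

In this model, keys that are all multiples of 1024 land in a single bucket under the identity hash, because their low 10 bits are always zero, while the mixed hash spreads them across many buckets. Consecutive keys 0 to 1023, on the other hand, already fill every bucket under the identity hash, so mixing brings no benefit there. A table that reduces the hash modulo a prime bucket count would behave differently again, which is why the choice depends on the container.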
