What is the best hash function for uint64_t keys ranging from 0 to its max value?


Question

Suppose we have a set of elements to store in a hash table (for example std::unordered_set), and each element has a key of type uint64_t whose value can range from 0 to its maximum possible value. Is the trivial hash function, where the hash of a key is the key itself, the best choice? Does the answer depend on the container in use (e.g. Google's sparse hash vs. the STL unordered map)? The probability distribution of the key values is unknown.

Recommended answer

If all you have to hash is a uint64_t that can take any possible value with unknown probabilities, and your output must also be a uint64_t, then you gain no advantage by changing the value. Just use the key itself.
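As a minimal sketch of "use the key itself", the functor below returns the key unchanged and plugs into std::unordered_set. The names IdentityHash and identity_set_roundtrip are hypothetical and chosen only for illustration. Some standard library implementations of std::hash<std::uint64_t> may already behave this way, in which case a custom functor is unnecessary.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_set>

// Hypothetical functor: the hash of a key is the key itself.
struct IdentityHash {
    std::size_t operator()(std::uint64_t key) const noexcept {
        // Note: this truncates the key on platforms where size_t is 32 bits wide.
        return static_cast<std::size_t>(key);
    }
};

// Hypothetical helper: inserts a few keys, including both extremes of the
// range, and reports whether lookups behave as expected.
inline bool identity_set_roundtrip() {
    std::unordered_set<std::uint64_t, IdentityHash> keys;
    keys.insert(0);
    keys.insert(42);
    keys.insert(UINT64_MAX);
    return keys.size() == 3 && keys.count(42) == 1 && keys.count(7) == 0;
}
```

The container still reduces the returned value to a bucket index by its own rule, so the functor only controls the value handed to the table, not the final bucket.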

If you knew something about the distribution of your values, or your values were restricted to a smaller range (which is really the same thing as knowing about the distribution), then it could be beneficial to apply a transformation to the key, but this depends on the implementation of the container. The only benefit would be fewer collisions when the table transforms a hash into a bucket index, and that depends both on the table's algorithm and on the current or average state of the table (how often each bucket is used).
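To illustrate how a known key distribution can interact with the bucket-index step, here is a rough sketch. It assumes a hypothetical table that computes the bucket index as hash & (bucket_count - 1) with a power-of-two bucket count. Real containers may use a different rule (a prime modulus, for instance), so the effect shown here is not guaranteed for any particular library. The function mix64 uses the multiply and xor-shift constants of the splitmix64 finalizer. distinct_buckets and identity64 are illustrative names, not part of any library.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_set>

// Mixing step built from the splitmix64 finalizer constants.
// Each operation is invertible, so distinct keys stay distinct.
inline std::uint64_t mix64(std::uint64_t x) {
    x ^= x >> 30;
    x *= 0xbf58476d1ce4e5b9ULL;
    x ^= x >> 27;
    x *= 0x94d049bb133111ebULL;
    x ^= x >> 31;
    return x;
}

// Trivial hash: the key itself.
inline std::uint64_t identity64(std::uint64_t x) { return x; }

// Assumed table model (hypothetical): bucket = hash & (bucket_count - 1),
// where bucket_count is a power of two. Counts how many distinct buckets
// the keys 0, stride, 2*stride, ... (n keys in total) land in.
template <typename Hash>
std::size_t distinct_buckets(Hash hash, std::uint64_t stride,
                             std::size_t n, std::uint64_t bucket_count) {
    std::unordered_set<std::uint64_t> used;
    for (std::size_t k = 0; k < n; ++k) {
        used.insert(hash(k * stride) & (bucket_count - 1));
    }
    return used.size();
}
```

Under this model, keys that are all multiples of 1024 fall into a single bucket of a 1024-bucket table when hashed with identity64, while mix64 spreads them over many buckets. With keys drawn uniformly from the full range, neither choice would be expected to do better than the other.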
