与std :: hash的意外冲突 [英] Unexpected collision with std::hash

查看:572
本文介绍了与std :: hash的意外冲突的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我知道hashing无限数量的字符串到32b int必须产生碰撞,但我希望从hashing函数一些不错的分布。

I know hashing infinite number of string into 32b int must generate collision, but I expect from hashing function some nice distribution.

这些2是不奇怪的

size_t hash0 = std::hash<std::string>()("generated_id_0");
size_t hash1 = std::hash<std::string>()("generated_id_1");
//hash0 == hash1



我知道我可以使用 boost :: hash< std :: string> 或其他,但我想知道 std :: hash 有什么问题。我使用它错了吗?

I know I can use boost::hash<std::string> or others, but I want to know what is wrong with std::hash. Am I using it wrong? Shouldn't I somehow "seed" it?

推荐答案

使用 std没有问题: :hash 。问题是,由Visual Studio 2010捆绑的标准库实现提供的专业化 std :: hash< std :: string> 只需要字符串字符的一个子集来确定哈希值(大概是出于性能原因)。巧合的是,带有14个字符的字符串的最后一个字符不是这个集合的一部分,这就是为什么这两个字符串产生相同的哈希值。

There's nothing wrong with your usage of std::hash. The problem is that the specialization std::hash<std::string> provided by the standard library implementation bundled with Visual Studio 2010 only takes a subset of the string's characters to determine the hash value (presumably for performance reasons). Coincidentally the last character of a string with 14 characters is not part of this set, which is why both strings yield the same hash value.

据我所知符合标准,这要求仅对具有相同参数的散列函数的多个调用必须总是返回相同的值。然而,哈希冲突的概率应该是最小的。 VS2010实现满足强制性部分,但未能考虑可选的。

As far as I know this behaviour is in conformance with the standard, which demands only that multiple calls to the hash function with the same argument must always return the same value. However, the probability of a hash collision should be minimal. The VS2010 implementation fulfills the mandatory part, yet fails to account for the optional one.

有关详细信息,请参阅头文件中的实现 xfunctional (从我的副本中的第869行开始)和C ++标准的§17.6.3.4(

For details, see the implementation in the header file xfunctional (starting at line 869 in my copy) and §17.6.3.4 of the C++ standard (latest public draft).

如果你绝对需要一个更好的字符串哈希函数,你应该自己实现它。它实际上是不是那么难

If you absolutely need a better hash function for strings, you should implement it yourself. It's actually not that hard.

这篇关于与std :: hash的意外冲突的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆