Why is an initial prime used in GetHashCode implementations?

Question

Looking at "What is the best algorithm for an overridden System.Object.GetHashCode?", I was struck that many of the answers suggesting hash codes of the form hash = hash*(prime) + item.GetHashCode() initially seed the value of hash with another prime rather than 0.

I understand the reason for the prime in the calculation portion: coprime numbers are useful in many ways.

What I don't understand is why the hash is initialised to a non-zero number in the first place.

Looking at the precise example:

int hash = 17;
hash = hash * 23 + field1.GetHashCode();
hash = hash * 23 + field2.GetHashCode();
hash = hash * 23 + field3.GetHashCode();
return hash;

For shorthand, let field1.GetHashCode() be represented by f1 (and so on for the others) and the initial hash value by i. This gives:

int hash = i;
hash = i * 23 + f1;
hash = (i * 23 + f1)* 23 + f2;
hash = ((i * 23 + f1)* 23 + f2)* 23 + f3;

Expanding the brackets in that last row:

hash = (i*23*23 + f1*23 + f2)* 23 + f3;
hash = i*23*23*23 + f1*23*23 + f2*23 + f3;

So, as we can see, the only effect of the initial hash value is to increase the final hash value by the constant i*23*23*23, which generalises to i*23^(number of fields).
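The same expansion can be written for any number of fields. As a sketch, with n standing for the number of fields and h_n for the final hash (both symbols introduced here, not in the original code), and treating the arithmetic as wrapping modulo 2^32 as 32-bit int arithmetic does:

\[
h_n = i \cdot 23^{n} + \sum_{k=1}^{n} f_k \cdot 23^{\,n-k} \pmod{2^{32}}
\]

For n = 3 this is i*23^3 + f1*23^2 + f2*23 + f3, matching the last line above. The seed term i*23^n depends only on n, so for a fixed number of fields it shifts every hash by the same constant.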

So how does this help? If f1, f2 and f3 are all 0, is it a problem if the final hash is 0? Is it better for it to be something non-zero? My only thought is that implementations of things like dictionaries or hash sets that use the hash value prefer non-zero values for some reason, but I can't think what that reason might be. The other possibility, of course, is that these things are a little bit arcane, so people use something tried and tested, and the initial value gets propagated even though there is no reason for it.

I tried looking up some Microsoft hash code implementations, but the ones I found either used external code to calculate the value (object, string) or were slightly special (the implementation of GetHashCode on anonymous objects seeds the hash code based on the property names of the anonymous object, which is different because it isn't a constant initial value).

So, in summary: why is an initial constant value used in hash code implementations?

Edit: "Why use a prime number in hashCode?" was suggested as a duplicate, and the site wants me to edit my question to explain why it is not one. I have acknowledged that primes are used as the multiplier in the calculation, and I understand why that is. This question is explicitly about the use of a prime as the initial seed in the hash code algorithm. The suggested duplicate doesn't explicitly say what the prime is used for, but its answers all address its use as a multiplying factor, which is not relevant to this question.

Answer

This question has some good answers on the Computer Science SE. In short: the initial constant is adapted from hashes that could take a variable number of inputs, and you’re right that it doesn’t matter in that example.
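To illustrate the variable-number-of-inputs case, here is a minimal sketch. The SeedDemo class and its Combine helper are hypothetical names made up for this example; they are not from the question or from any library. It folds an arbitrary number of field hash codes with the same multiply-and-add step and shows that a zero seed lets all-zero inputs of different lengths collide, while a non-zero seed makes the length contribute to the result.

```csharp
using System;

public static class SeedDemo
{
    // Hypothetical helper: folds any number of field hash codes using the
    // same "hash * 23 + f" step as the example in the question.
    public static int Combine(int seed, params int[] fieldHashes)
    {
        unchecked // wrap on overflow even if the project enables checked arithmetic
        {
            int hash = seed;
            foreach (int f in fieldHashes)
            {
                hash = hash * 23 + f;
            }
            return hash;
        }
    }

    public static void Main()
    {
        // Seed 0: inputs consisting only of zeros collide regardless of length.
        Console.WriteLine(Combine(0));        // 0
        Console.WriteLine(Combine(0, 0));     // 0
        Console.WriteLine(Combine(0, 0, 0));  // 0

        // Seed 17: the number of inputs now contributes 17 * 23^n.
        Console.WriteLine(Combine(17));       // 17
        Console.WriteLine(Combine(17, 0));    // 391
        Console.WriteLine(Combine(17, 0, 0)); // 8993
    }
}
```

With a fixed number of fields, as in the question's example, every hash gets the same offset, which is consistent with the observation that the seed makes no difference there.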
