为什么以及如何HashMap有自己的内部实现hashCode()称为哈希()? [英] Why and how does HashMap have its own internal implementation of hashCode() called hash()?

查看:324
本文介绍了为什么以及如何HashMap有自己的内部实现hashCode()称为哈希()?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

根据此博客条目 ,HashMap在已经检索到的哈希码上重新调用自己的 hashCode()实现(称为 hash())。 / b>


如果key不为空,那么它将在关键对象上调用散列函数,见上面方法的第4行,即key.hashCode(),因此,在key.hashCode()返回hashValue之后,第4行看起来像

int hash = hash(hashValue)

现在,它将返回的hashValue应用到它自己的哈希函数中。

我们可能想知道为什么我们使用hash(hashValue)再次计算哈希值。答案是,它可以防范低质量的hash>函数。

准确地可以重新分配哈希码吗? HashMap可以存储对象,但它无法访问为其对象分配hashCode的逻辑。例如, hash()不可能将逻辑整合到下面的 hashCode()实现中:

  public class Employee {
protected long employeeId;
保护字符串firstName;
保护字符串lastName;

public int hashCode(){
return(int)employeeId;
}

}


解决方案

hash()从实际的哈希码派生出改进的哈希码,所以相同的输入总是等于输出(来自jdk1.8.0_51):

  static final hash(Object key){
int h;
return(key == null)? 0:(h = key.hashCode())^(h>>> 16);至于为什么哈希码需要改进,请阅读该方法的javadoc:


$ b


计算key.hashCode(),并散布(XORs)较高位的散列至较低位。由于该表使用幂级别掩码,所以只在当前掩码之上的位上发生变化的散列总是会发生冲突。 (在已知的例子中,Float键的集合在小表中保持连续的整数)。因此,我们应用一个变换向下扩展高位的影响。在速度,效用和比特传播质量之间进行权衡。因为许多常见的散列集合已经合理分布(所以不能从扩散中受益),并且因为我们使用树来处理箱子中的大量碰撞,所以我们只需以最便宜的方式对一些位移进行异或运算,以减少系统损失,以及合并由于表边界而不会用于索引计算的最高位的影响。



According to this blog entry, HashMap reinvokes its own implementation of hashCode() (called hash()) on a hashcode it already retrieved.

If key is not null then , it will call hashfunction on the key object , see line 4 in above method i.e. key.hashCode() ,so after key.hashCode() returns hashValue , line 4 looks like

int hash = hash(hashValue)

and now ,it applies returned hashValue into its own hashing function .

We might wonder why we are calculating the hashvalue again using hash(hashValue). Answer is ,It defends against poor quality hash >functions.

Can HashMap accurately reassign hashcodes? HashMap can store objects, but it doesn't have access to the logic that assigns a hashCode its objects. For example, hash() couldn't possibly integrate the logic behind the following hashCode() implementation:

public class Employee {
protected long employeeId;
protected String firstName;
protected String lastName;

public int hashCode(){
    return (int) employeeId;
}

}

解决方案

The hash() derives the "improved" hash code from the actual hash code, so equal input will always be equal output (from jdk1.8.0_51):

static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

As to why the hash code needs improvement, read the javadoc of the method:

Computes key.hashCode() and spreads (XORs) higher bits of hash to lower. Because the table uses power-of-two masking, sets of hashes that vary only in bits above the current mask will always collide. (Among known examples are sets of Float keys holding consecutive whole numbers in small tables.) So we apply a transform that spreads the impact of higher bits downward. There is a tradeoff between speed, utility, and quality of bit-spreading. Because many common sets of hashes are already reasonably distributed (so don't benefit from spreading), and because we use trees to handle large sets of collisions in bins, we just XOR some shifted bits in the cheapest possible way to reduce systematic lossage, as well as to incorporate impact of the highest bits that would otherwise never be used in index calculations because of table bounds.

这篇关于为什么以及如何HashMap有自己的内部实现hashCode()称为哈希()?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆