更改为HashMap的哈希函数在Java中8 [英] Change to HashMap hash function in Java 8

查看:116
本文介绍了更改为HashMap的哈希函数在Java中8的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

在Java的8 java.util.HashMap中我注意到一个变化<一个href=\"http://grep$c$c.com/file/repository.grep$c$c.com/java/root/jdk/openjdk/7-b147/java/util/HashMap.java#HashMap.hash%28int%29\">from:

In java 8 java.util.Hashmap I noticed a change from:

static int hash(int h) {
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);

<一个href=\"http://grep$c$c.com/file/repository.grep$c$c.com/java/root/jdk/openjdk/8-b132/java/util/HashMap.java#HashMap.hash%28java.lang.Object%29\">to:

static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);

这似乎从code,新功能是一个简单的 XOR 与上部16离开高16位不变,而不是低16位的到在previous实现,并从评论,这是在散列函数的结果具有大量在较低位,以不同的桶碰撞分配效果较差,但通过具有少做节省CPU周期数不同的偏移操作。

It appears from the code that the new function is a simpler XOR of the lower 16 bits with the upper 16 leaving the upper 16 bits unchanged, as opposed to several different shifts in the previous implementation, and from the comments that this is less effective at allocating the results of hash functions with a high number of collisions in lower bits to different buckets, but saves CPU cycles by having to do less operations.

我在发行说明中看到的是 从链表变成平衡树来存储碰撞密钥的唯一的事(我认为可能已经改变的时候这是有道理的花费计算好的哈希)的金额,我是在如果有此变化对大的哈希地图的任何预期的性能的影响看特别感兴趣。有没有关于这种变化的任何信息,或有没有人用散列函数有更好的认识有什么这种变化的影响可能是一个想法(如果有的话,也许我只是误解了code),如果有任何需要移动到Java 8时,产生不同的方式来维持性能哈希codeS?

The only thing I saw in the release notes was the change from linked lists to balanced trees to store colliding keys (which I thought might have changed the amount of time it made sense to spend calculating a good hash), I was specifically interested in seeing if there was any expected performance impact from this change on large hash maps. Is there any information about this change, or does anyone with a better knowledge of hash functions have an idea of what the implications of this change might be (if any, perhaps I just misunderstood the code) and if there was any need to generate hash codes in a different way to maintain performance when moving to Java 8?

推荐答案

正如你指出的:有在<一个描述的HashMap A显著的性能提升在Java中8 HREF =htt​​p://openjdk.java.net/jeps​​/180相对=nofollow> JEP-180 。基本上,如果一个哈希链越过一定的规模,在的HashMap 将(如果可能)与二叉树更换。这让各种操作的最坏情况的行为 O(日志N)而不是 O(N)

As you noted: there is a significant performance improvement in HashMap in Java 8 as described in JEP-180. Basically, if a hash chain goes over a certain size, the HashMap will (where possible) replace it with a binary tree. This makes the "worst case" behaviour of various operations O(log N) instead of O(N).

这并不直接说明更改为。不过,我的或推测的,在JEP-180的优化手段,由于命中被分布不均哈希函数的性能是次要的,而对于成本效益分析哈希方法的变化;即更复杂的版本不太有利的平均的。 (熊绑定,当密钥类型的散code 方法生成高质量codeS,那么的复杂的版本体操哈希方法是在浪费时间。)

This doesn't directly explain the change to hash. However, I would hypothesize that the optimization in JEP-180 means that the performance hit due to a poorly distributed hash function is less important, and that the cost-benefit analysis for the hash method changes; i.e. the more complex version is less beneficial on average. (Bear in bind that when the key type's hashcode method generates high quality codes, then gymnastics in the complex version of the hash method are a waste of time.)

不过,这只是一个理论。为哈希的真正理由的变化是最有可能的Oracle机密。

But this is only a theory. The real rationale for the hash change is most likely Oracle confidential.

这篇关于更改为HashMap的哈希函数在Java中8的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆