HashMap Java 8实现 [英] HashMap Java 8 implementation

查看:96
本文介绍了HashMap Java 8实现的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

根据以下链接文档:

如果我们的hashcode函数是好的,它将提供一个均匀分布,所以所有的桶都会被平等地使用。在这种情况下,存储桶使用链接列表来存储这些值。





但是你不能依靠人来实现好的散列函数。人们通常会编写糟糕的散列函数,这将导致非均匀分布。





这个分布越少,我们从O(1)操作开始进一步移动,越接近O(n)操作。

Hashmap的实现尝试通过将一些桶组织到树中而不是链接列表来缓解这种情况,如果桶变得太大。这就是 TREEIFY_THRESHOLD = 8 的用途。如果一个存储桶包含八个以上的项目,它应该成为一棵树。





这棵树是一棵红黑树。它首先按哈希码进行排序。如果哈希码相同,如果对象实现该接口,则它使用 Comparable compareTo 方法,否则标识哈希码。

如果从映射中删除条目,那么存储桶中条目的数量可能会减少,因此不再需要此树结构。这就是 UNTREEIFY_THRESHOLD = 6 的用途。如果一个桶中的元素数量低于6,我们不妨回到使用链表。



最后,还有 MIN_TREEIFY_CAPACITY = 64



当一个哈希映射的大小增大时,它会自动调整其自身的大小以便拥有更多的桶。如果我们有一个小的哈希映射,我们获得非常满的桶的可能性非常高,因为我们没有多少不同的桶来投入。拥有更大的哈希映射要好得多,而更多的桶不够满。这个常量基本上说,如果我们的哈希映射非常小,就不要开始制作树枝了 - 它应该调整大一些来代替。




要回答有关性能增益的问题,可以添加这些优化以改善最差情况。我只是猜测,但如果你的 hashCode 函数不是很好,你可能只会看到明显的性能改进,因为这些优化。






图片是我的(感谢MSPaint)。重新使用它们,无论你喜欢什么。


As per the following link document: Java HashMap Implementation

I'm confused with the implementation of HashMap (or rather, an enhancement in HashMap). My queries are:

Firstly

static final int TREEIFY_THRESHOLD = 8;
static final int UNTREEIFY_THRESHOLD = 6;
static final int MIN_TREEIFY_CAPACITY = 64;

Why and how are these constants used? I want some clear examples for this. How they are achieving a performance gain with this?

Secondly

If you see the source code of HashMap in JDK, you will find the following static inner class:

static final class TreeNode<K, V> extends java.util.LinkedHashMap.Entry<K, V> {
    HashMap.TreeNode<K, V> parent;
    HashMap.TreeNode<K, V> left;
    HashMap.TreeNode<K, V> right;
    HashMap.TreeNode<K, V> prev;
    boolean red;

    TreeNode(int arg0, K arg1, V arg2, HashMap.Node<K, V> arg3) {
        super(arg0, arg1, arg2, arg3);
    }

    final HashMap.TreeNode<K, V> root() {
        HashMap.TreeNode arg0 = this;

        while (true) {
            HashMap.TreeNode arg1 = arg0.parent;
            if (arg0.parent == null) {
                return arg0;
            }

            arg0 = arg1;
        }
    }
    //...
}

How is it used? I just want an explanation of the algorithm.

解决方案

HashMap contains a certain number of buckets. It uses hashCode to determine which bucket to put these into. For simplicity's sake imagine it as a modulus.

If our hashcode is 123456 and we have 4 buckets, 123456 % 4 = 0 so the item goes in the first bucket, Bucket 1.

If our hashcode function is good, it will provide an even distribution so all the buckets will be used somewhat equally. In this case, the bucket uses a linked list to store the values.

But you can't rely on people to implement good hash functions. People will often write poor hash functions which will result in a non-even distribution.

The less even this distribution is, the further we're moving from O(1) operations and the closer we're moving towards O(n) operations.

The implementation of Hashmap tries to mitigate this by organising some buckets into trees rather than linked lists if the buckets becomes too large. This is what TREEIFY_THRESHOLD = 8 is for. If a bucket contains more than eight items, it should become a tree.

This tree is a Red-Black tree. It is first sorted by hash code. If the hash codes are the same, it uses the compareTo method of Comparable if the objects implement that interface, else the identity hash code.

If entries are removed from the map, the number of entries in the bucket might reduce such that this tree structure is no longer necessary. That's what the UNTREEIFY_THRESHOLD = 6 is for. If the number of elements in a bucket drops below six, we might as well go back to using a linked list.

Finally, there is the MIN_TREEIFY_CAPACITY = 64.

When a hash map grows in size, it automatically resizes itself to have more buckets. If we have a small hash map, the likelihood of us getting very full buckets is quite high, because we don't that have many different buckets to put stuff into. It's much better to have a bigger hash map, with more buckets that are less full. This constant basically says not to start making buckets into trees if our hash map is very small - it should resize to be larger first instead.


To answer your question about the performance gain, these optimisations were added to improve the worst case. I'm only speculating but you would probably only see a noticeable performance improvement because of these optimisations if your hashCode function was not very good.


Images are mine (thanks MSPaint). Reuse them however you like.

这篇关于HashMap Java 8实现的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆