HashMap调整大小的方法实现细节 [英] HashMap resize method implementation detail

查看:138
本文介绍了HashMap调整大小的方法实现细节的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

正如标题所示,这是一个关于 HashMap#resize 的实现细节的问题 - 这就是内部数组的大小加倍的问题。
这有点罗嗦,但我真的试图证明我尽我所能理解这一点......



这发生在一个点这个特定的bucket / bin存储在一个 Linked 时尚中 - 因此具有精确的顺序,并且在这个问题的上下文中。 p>

一般而言, resize 也可以从其他地方调用,但我们只看这种情况。假设你把这些字符串作为键放在 HashMap 中(右边有 hashcode 之后 HashMap#hash - 这是内部重新哈希。)是的,这些是仔细生成的,而不是随机的。

  DFHXR  -  11111 
YSXFJ - 01111
TUDDY - 11111
AXVUH - 01111
RUTWZ - 11111
DEDUC - 01111
WFCVW - 11111
ZETCU - 01111
GCVUR - 11111

在这里有一个简单的模式 - 最后4位对于所有这些都是相同的 - 这意味着当我们插入8个这样的键时(总共有9个),它们将在同一个桶中结束;并且在第9个 HashMap#put 中,将调用 resize



因此,如果当前在 HashMap 中有8个条目(具有上述其中一个键),则意味着该映射中有16个存储桶,它们的最后4位关键字决定了条目最终的位置。



我们把第九个键。此时, TREEIFY_THRESHOLD 被命中,并调用 resize 。这些箱子加倍到 32 ,并且从这些钥匙中再多出一点来决定入口的位置(现在是5位)。



最终这段代码已经到达(当 resize 发生时):

 节点< K,V> loHead = null,loTail = null; 
节点< K,V> hiHead = null,hiTail = null;
节点< K,V>下一个;
do {
next = e.next; ((e.hash& oldCap)== 0){
if(loTail == null)
loHead = e;
else
loTail.next = e;
loTail = e;
}
else {
if(hiTail == null)
hiHead = e;
else
hiTail.next = e;
hiTail = e; ((e = next)!= null);
}
};



if(loTail!= null){
loTail.next = null;
newTab [j] = loHead;
}
if(hiTail!= null){
hiTail.next = null;
newTab [j + oldCap] = hiHead;
}

实际上并不复杂......它做了什么,到将移动移动到其他垃圾箱和不会移动到其他垃圾箱的项目中 - 但仍然会保留在这个项目中。



实际上它很聪明,它是如何做到的 - 它通过这段代码:

 <(code> if((e.hash& oldCap)== 0)

这是否检查下一位(本例中是第5位)是否实际为零 - 如果是,则表示该条目将保留在原来的位置;如果不是,它将在新的垃圾箱中以两个偏移量的功率移动。



现在终于出现了这样的问题:调整大小中的那段代码是精心制作的,以便保留条目的顺序那个垃圾桶。

因此,在将这9个键放入 HashMap 后,顺序将为:

  DFHXR  - > TUDDY  - > RUTWZ  - > WFCVW  - > GCVUR(一个bin)

YSXFJ - > AXVUH - > DEDUC - > ZETCU(另一个bin)

为什么要保留 HashMap中。订单在 Map 真的坏,详细这里

解决方案


真的很糟糕[...]

这并不坏,这是(学术术语)。 Stuart Marks在您发布的第一个链接中写道:


[...]为未来的实施变更保留了灵活性[...] / p>

这意味着(据我了解)现在实现恰好保持顺序,但是在未来如果找到更好的实现,它会被用来保存订单。


As the title suggests this is a question about an implementation detail from HashMap#resize - that's when the inner array is doubled in size. It's a bit wordy, but I've really tried to prove that I did my best understanding this...

This happens at a point when entries in this particular bucket/bin are stored in a Linked fashion - thus having an exact order and in the context of the question this is important.

Generally the resize could be called from other places as well, but let's look at this case only.

Suppose you put these strings as keys in a HashMap (on the right there's the hashcode after HashMap#hash - that's the internal re-hashing.) Yes, these are carefully generated, not random.

 DFHXR - 11111
 YSXFJ - 01111 
 TUDDY - 11111 
 AXVUH - 01111 
 RUTWZ - 11111
 DEDUC - 01111
 WFCVW - 11111
 ZETCU - 01111
 GCVUR - 11111 

There's a simple pattern to notice here - the last 4 bits are the same for all of them - which means that when we insert 8 of these keys (there are 9 total), they will end-up in the same bucket; and on the 9-th HashMap#put, the resize will be called.

So if currently there are 8 entries (having one of the keys above) in the HashMap - it means there are 16 buckets in this map and the last 4 bits of they key decided where the entries end-up.

We put the nine-th key. At this point TREEIFY_THRESHOLD is hit and resize is called. The bins are doubled to 32 and one more bit from the keys decides where that entry will go (so, 5 bits now).

Ultimately this piece of code is reached (when resize happens):

 Node<K,V> loHead = null, loTail = null;
 Node<K,V> hiHead = null, hiTail = null;
 Node<K,V> next;
 do {
     next = e.next;
     if ((e.hash & oldCap) == 0) {
          if (loTail == null)
               loHead = e;
          else
               loTail.next = e;
          loTail = e;
     }
     else {
        if (hiTail == null)
            hiHead = e;
        else
            hiTail.next = e;
        hiTail = e;
     }
 } while ((e = next) != null);



 if (loTail != null) {
     loTail.next = null;
     newTab[j] = loHead;
 }
 if (hiTail != null) {
     hiTail.next = null;
     newTab[j + oldCap] = hiHead;
 }

It's actually not that complicated... what it does it splits the current bin into entries that will move to other bins and to entries that will not move to other bins - but will remain into this one for sure.

And it's actually pretty smart how it does that - it's via this piece of code:

 if ((e.hash & oldCap) == 0) 

What this does is check if the next bit (the 5-th in our case) is actually zero - if it is, it means that this entry will stay where it is; if it's not it will move with a power of two offset in the new bin.

And now finally the question: that piece of code in the resize is carefully made so that it preserves the order of the entries there was in that bin.

So after you put those 9 keys in the HashMap the order is going to be :

DFHXR -> TUDDY -> RUTWZ -> WFCVW -> GCVUR (one bin)

YSXFJ -> AXVUH -> DEDUC -> ZETCU (another bin)

Why would you want to preserve order of some entries in the HashMap. Order in a Map is really bad as detailed here or here.

解决方案

Order in a Map is really bad [...]

It's not bad, it's (in academic terminology) whatever. What Stuart Marks wrote at the first link you posted:

[...] preserve flexibility for future implementation changes [...]

Which means (as I understand it) that now the implementation happens to keep the order, but in the future if a better implementation is found, it will be used either it keeps the order or not.

这篇关于HashMap调整大小的方法实现细节的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆