HashMap调整大小的方法实现细节 [英] HashMap resize method implementation detail
问题描述
正如标题所示,这是一个关于 HashMap#resize
的实现细节的问题 - 这就是内部数组的大小加倍的问题。
这有点罗嗦,但我真的试图证明我尽我所能理解这一点......
这发生在一个点这个特定的bucket / bin存储在一个 一般而言, 在这里有一个简单的模式 - 最后4位对于所有这些都是相同的 - 这意味着当我们插入8个这样的键时(总共有9个),它们将在同一个桶中结束;并且在第9个 因此,如果当前在 我们把第九个键。此时, 最终这段代码已经到达(当 实际上并不复杂......它做了什么,到将移动移动到其他垃圾箱和不会移动到其他垃圾箱的项目中 - 但仍然会保留在这个项目中。 实际上它很聪明,它是如何做到的 - 它通过这段代码: Linked 时尚中 - 因此具有精确的顺序,并且在这个问题的上下文中。 p>
resize
也可以从其他地方调用,但我们只看这种情况。假设你把这些字符串作为键放在 HashMap
中(右边有 hashcode
之后 HashMap#hash
- 这是内部重新哈希。)是的,这些是仔细生成的,而不是随机的。
DFHXR - 11111
YSXFJ - 01111
TUDDY - 11111
AXVUH - 01111
RUTWZ - 11111
DEDUC - 01111
WFCVW - 11111
ZETCU - 01111
GCVUR - 11111
HashMap#put
中,将调用 resize
。
HashMap
中有8个条目(具有上述其中一个键),则意味着该映射中有16个存储桶,它们的最后4位关键字决定了条目最终的位置。
TREEIFY_THRESHOLD
被命中,并调用 resize
。这些箱子加倍到 32
,并且从这些钥匙中再多出一点来决定入口的位置(现在是5位)。
resize
发生时):
节点< K,V> loHead = null,loTail = null;
节点< K,V> hiHead = null,hiTail = null;
节点< K,V>下一个;
do {
next = e.next; ((e.hash& oldCap)== 0){
if(loTail == null)
loHead = e;
else
loTail.next = e;
loTail = e;
}
else {
if(hiTail == null)
hiHead = e;
else
hiTail.next = e;
hiTail = e; ((e = next)!= null);
}
};
if(loTail!= null){
loTail.next = null;
newTab [j] = loHead;
}
if(hiTail!= null){
hiTail.next = null;
newTab [j + oldCap] = hiHead;
}
<(code> if((e.hash& oldCap)== 0)
这是否检查下一位(本例中是第5位)是否实际为零 - 如果是,则表示该条目将保留在原来的位置;如果不是,它将在新的垃圾箱中以两个偏移量的功率移动。
现在终于出现了这样的问题:调整大小中的那段代码是精心制作的,以便保留条目的顺序那个垃圾桶。
因此,在将这9个键放入 HashMap
后,顺序将为:
DFHXR - > TUDDY - > RUTWZ - > WFCVW - > GCVUR(一个bin)
YSXFJ - > AXVUH - > DEDUC - > ZETCU(另一个bin)
为什么要保留 HashMap中
。订单在 Map
真的坏,详细这里或 。
真的很糟糕[...]
这并不坏,这是(学术术语)。 Stuart Marks在您发布的第一个链接中写道:
[...]为未来的实施变更保留了灵活性[...] / p>
这意味着(据我了解)现在实现恰好保持顺序,但是在未来如果找到更好的实现,它会被用来保存订单。
As the title suggests this is a question about an implementation detail from HashMap#resize
- that's when the inner array is doubled in size.
It's a bit wordy, but I've really tried to prove that I did my best understanding this...
This happens at a point when entries in this particular bucket/bin are stored in a Linked
fashion - thus having an exact order and in the context of the question this is important.
Generally the resize
could be called from other places as well, but let's look at this case only.
Suppose you put these strings as keys in a HashMap
(on the right there's the hashcode
after HashMap#hash
- that's the internal re-hashing.) Yes, these are carefully generated, not random.
DFHXR - 11111
YSXFJ - 01111
TUDDY - 11111
AXVUH - 01111
RUTWZ - 11111
DEDUC - 01111
WFCVW - 11111
ZETCU - 01111
GCVUR - 11111
There's a simple pattern to notice here - the last 4 bits are the same for all of them - which means that when we insert 8 of these keys (there are 9 total), they will end-up in the same bucket; and on the 9-th HashMap#put
, the resize
will be called.
So if currently there are 8 entries (having one of the keys above) in the HashMap
- it means there are 16 buckets in this map and the last 4 bits of they key decided where the entries end-up.
We put the nine-th key. At this point TREEIFY_THRESHOLD
is hit and resize
is called. The bins are doubled to 32
and one more bit from the keys decides where that entry will go (so, 5 bits now).
Ultimately this piece of code is reached (when resize
happens):
Node<K,V> loHead = null, loTail = null;
Node<K,V> hiHead = null, hiTail = null;
Node<K,V> next;
do {
next = e.next;
if ((e.hash & oldCap) == 0) {
if (loTail == null)
loHead = e;
else
loTail.next = e;
loTail = e;
}
else {
if (hiTail == null)
hiHead = e;
else
hiTail.next = e;
hiTail = e;
}
} while ((e = next) != null);
if (loTail != null) {
loTail.next = null;
newTab[j] = loHead;
}
if (hiTail != null) {
hiTail.next = null;
newTab[j + oldCap] = hiHead;
}
It's actually not that complicated... what it does it splits the current bin into entries that will move to other bins and to entries that will not move to other bins - but will remain into this one for sure.
And it's actually pretty smart how it does that - it's via this piece of code:
if ((e.hash & oldCap) == 0)
What this does is check if the next bit (the 5-th in our case) is actually zero - if it is, it means that this entry will stay where it is; if it's not it will move with a power of two offset in the new bin.
And now finally the question: that piece of code in the resize is carefully made so that it preserves the order of the entries there was in that bin.
So after you put those 9 keys in the HashMap
the order is going to be :
DFHXR -> TUDDY -> RUTWZ -> WFCVW -> GCVUR (one bin)
YSXFJ -> AXVUH -> DEDUC -> ZETCU (another bin)
Why would you want to preserve order of some entries in the HashMap
. Order in a Map
is really bad as detailed here or here.
Order in a Map is really bad [...]
It's not bad, it's (in academic terminology) whatever. What Stuart Marks wrote at the first link you posted:
[...] preserve flexibility for future implementation changes [...]
Which means (as I understand it) that now the implementation happens to keep the order, but in the future if a better implementation is found, it will be used either it keeps the order or not.
这篇关于HashMap调整大小的方法实现细节的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!