HashTable和HashMap键值如何存储在内存中? [英] How HashTable and HashMap key-value are stored in the memory?
问题描述
但我不明白碰撞是如何发生的在这里? Java使用哪种哈希算法来创建内存空间?它是MD5吗?
HashMap
的基本思想是:
- 一个
HashMap
实际上是一个包含Key和Value的特殊对象数组。 - 数组有一定数量的桶(槽),比如16个。
- 散列算法由
hashCode()
方法,每个对象都有。因此,在编写新的Class
时,应该注意正确的hashCode()
和equals()
方法的实现。默认的(Object
class)将内存指针作为一个数字。但是对于我们想要使用的大多数课程来说这并不好。例如,String
类使用一种算法来从字符串中的所有字符生成散列 - 将它想象为:hashCode = 1。 char + 2.char + 3.char ...
(简体)。因此,即使它们位于内存中的不同位置,两个相等的字符串也具有相同的hashCode()
。 - 如果我们有一个很大的数组,那么
hashCode()
的结果(比如说132)就是应该存储对象的桶的数量。但我们没有,我们的只有16桶。所以我们使用明显的计算'hashcode%array.length = bucket'
或'132 mod 16 = 4'
和将Key-Value对存储在第4个存储桶中。 -
- 如果还没有其他配对,则可以。
- 如果有Key与我们拥有的Key相同,那么我们删除旧的。 如果还有另一个Key-Value对(碰撞),我们将新的如果后备数组变得太满了,那么我们必须制作太多链接列表,我们创建一个长度加倍的新数组,重新组合所有元素并将它们添加到新数组中,然后处理旧数组。这很可能是
HashMap
中最昂贵的操作,所以你想告诉你的Maps
你会有多少桶如果有人试图获得某个值,那么他会提供一个密钥,我们对其进行哈希,对其进行修改,然后仔细检查潜在的链接列表匹配。
- 如果有Key与我们拥有的Key相同,那么我们删除旧的。 如果还有另一个Key-Value对(碰撞),我们将新的如果后备数组变得太满了,那么我们必须制作太多链接列表,我们创建一个长度加倍的新数组,重新组合所有元素并将它们添加到新数组中,然后处理旧数组。这很可能是
- 如果还没有其他配对,则可以。
在这种情况下,
- 有一个256个桶的数组(当然编号为0 -255)
- 有五个人。它们的哈希码在经过
mod 256
之后指向数组中的四个不同的槽。 - 你可以看到Sandra Dee didn'没有空闲插槽,所以她被约翰史密斯锁住了。
现在,如果您试图查找电话号码桑德拉迪,你可以把她的名字拼凑成256,然后看看桶152.在那里你会找到约翰史密斯。那不是桑德拉,看得更远......哈哈,桑德拉跟约翰一起被束缚着。
I understand there is a hashing technique is applied to a key to store its value in the memory address.
But I don't understand how the collision is happening here? Which hashing algorithm does Java use to create a memory space? Is it MD5?
The basic idea of HashMap
is this:
- A
HashMap
is really an array of special objects that hold both Key and Value. - The array has some amount of buckets (slots), say 16.
- The hashing algorithm is provided by the
hashCode()
method that every object has. Therefore, when you are writing a newClass
, you should take care of properhashCode()
andequals()
methods implementation. The default one (of theObject
class) takes the memory pointer as a number. But that's not good for most of the classes we would like to use. For example, theString
class uses an algorithm that makes the hash from all the characters in the string - think of it like this:hashCode = 1.char + 2.char + 3.char...
(simplified). Therefore, two equal Strings, even though they are on a different location in the memory, have the samehashCode()
. - The result of
hashCode()
, say "132", is then the number of bucket where the object should be stored if we had an array that big. But we don't, ours is only 16 buckets long. So we use the obvious calculation'hashcode % array.length = bucket'
or'132 mod 16 = 4'
and store the Key-Value pair in a bucket number 4. - If there's no other pair yet, it's ok.
- If there's one with Key equal to the Key we have, we remove the old one.
- If there's another Key-Value pair (collision), we chain the new one after the old one into a linked list.
- If the backing array becomes too full, so we would have to make too many linked lists, we make a new array with doubled length, rehash all the elements and add them into the new array, then we dispose the old one. This is most likely the most expensive operation on
HashMap
, so you want to tell yourMaps
how many buckets you'll use if you know it in before. - If somebody tries to get a value, he provides a key, we hash it, mod it, and then go through the potential linked list for the exact match.
An image, courtesy of Wikipedia:
In this case,
- there is an array with 256 buckets (numbered, of course, 0-255)
- there are five people. Their hashcodes, after being put through
mod 256
, point to four different slots in the array. - you can see Sandra Dee didn't have a free slot, so she has been chained after John Smith.
Now, if you tried to look up a telephone number of Sandra Dee, you would hash her name, mod it by 256, and look into the bucket 152. There you'd find John Smith. That's no Sandra, look further ... aha, there's Sandra chained after John.
这篇关于HashTable和HashMap键值如何存储在内存中?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!