链式哈希表与开放式地址哈希表 [英] Chained Hash Tables vs. Open-Addressed Hash Tables

查看:46
本文介绍了链式哈希表与开放式地址哈希表的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

有人能解释一下这两种实现之间(优点/缺点)的主要区别吗?

Can somebody explain the main differences between (advantages / disadvantages) the two implementations?

对于图书馆,推荐什么实现?

For a library, what implementation is recommended?

推荐答案

维基百科关于哈希表的文章 对人们使用的不同哈希表方案给出了明显更好的解释和概述,而我无法想象.事实上,阅读那篇文章可能比在这里提问要好.:)

Wikipedia's article on hash tables gives a distinctly better explanation and overview of different hash table schemes that people have used than I'm able to off the top of my head. In fact you're probably better off reading that article than asking the question here. :)

说...

链式哈希表索引到指向链表头部的指针数组.每个链表单元格都有为其分配的键和为该键插入的值.当您想从其键中查找特定元素时,键的散列用于确定要跟随哪个链表,然后遍历该特定列表以查找您要查找的元素.如果哈希表中有多个键具有相同的哈希值,那么您将拥有包含多个元素的链表.

A chained hash table indexes into an array of pointers to the heads of linked lists. Each linked list cell has the key for which it was allocated and the value which was inserted for that key. When you want to look up a particular element from its key, the key's hash is used to work out which linked list to follow, and then that particular list is traversed to find the element that you're after. If more than one key in the hash table has the same hash, then you'll have linked lists with more than one element.

链式散列的缺点是必须遵循指针来搜索链表.好处是链式哈希表只会随着负载因子(哈希表中元素与桶数组长度的比率)的增加而线性变慢,即使它上升到 1 以上.

The downside of chained hashing is having to follow pointers in order to search linked lists. The upside is that chained hash tables only get linearly slower as the load factor (the ratio of elements in the hash table to the length of the bucket array) increases, even if it rises above 1.

一个开放寻址的哈希表索引到一个指向(键,值)对的指针数组.您可以使用键的散列值来确定首先查看数组中的哪个槽.如果哈希表中的多个键具有相同的哈希值,那么您可以使用某种方案来决定另一个要查找的槽.例如,线性探测是您查看所选的下一个插槽之后的下一个插槽,然后再查看下一个插槽,依此类推,直到您找到与您正在寻找的键匹配的插槽,或者您击中了一个空位插槽(在这种情况下,密钥不能在那里).

An open-addressing hash table indexes into an array of pointers to pairs of (key, value). You use the key's hash value to work out which slot in the array to look at first. If more than one key in the hash table has the same hash, then you use some scheme to decide on another slot to look in instead. For example, linear probing is where you look at the next slot after the one chosen, and then the next slot after that, and so on until you either find a slot that matches the key you're looking for, or you hit an empty slot (in which case the key must not be there).

当负载因子较低时,开放寻址通常比链式散列更快,因为您不必遵循列表节点之间的指针.如果负载因子接近 1,它会变得非常非常慢,因为在找到要查找的键或空槽之前,您通常必须搜索存储桶数组中的许多槽.此外,哈希表中的元素永远不会多于存储桶数组中的条目.

Open-addressing is usually faster than chained hashing when the load factor is low because you don't have to follow pointers between list nodes. It gets very, very slow if the load factor approaches 1, because you end up usually having to search through many of the slots in the bucket array before you find either the key that you were looking for or an empty slot. Also, you can never have more elements in the hash table than there are entries in the bucket array.

为了处理当负载因子接近 1 时所有哈希表至少会变慢(并且在某些情况下实际上完全破坏)这一事实,实际的哈希表实现使存储桶数组更大(通过分配一个新的存储桶数组,并且当负载因子超过某个值(通常约为 0.7)时,将旧元素复制到新元素中,然后释放旧元素).

To deal with the fact that all hash tables at least get slower (and in some cases actually break completely) when their load factor approaches 1, practical hash table implementations make the bucket array larger (by allocating a new bucket array, and copying elements from the old one into the new one, then freeing the old one) when the load factor gets above a certain value (typically about 0.7).

以上所有内容都有很多变化.再次,请参阅维基百科文章,它确实非常好.

There are lots of variations on all of the above. Again, please see the wikipedia article, it really is quite good.

对于供其他人使用的库,我强烈建议进行试验.由于它们通常对性能非常重要,因此您通常最好使用其他人已经仔细调整过的哈希表的实现.有很多开源的 BSD、LGPL 和 GPL 许可的哈希表实现.

For a library that is meant to be used by other people, I would strongly recommend experimenting. Since they're generally quite performance-crucial, you're usually best off using somebody else's implementation of a hash table which has already been carefully tuned. There are lots of open-source BSD, LGPL and GPL licensed hash table implementations.

例如,如果您正在使用 GTK,那么您会发现 GLib 中的哈希表.

If you're working with GTK, for example, then you'll find that there's a good hash table in GLib.

这篇关于链式哈希表与开放式地址哈希表的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆