哈希如何具有o(1)搜索时间? [英] How does hashing have an o(1) search time?

查看:130
本文介绍了哈希如何具有o(1)搜索时间?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

当我们使用 HashTable 来存储数据时,据说搜索需要o(1)次。我很困惑,任何人都可以解释一下吗?

解决方案

它可能需要更长的时间,但通常不会。



基本上,哈希表是包含所有要搜索的键的数组。数组中每个键的位置由散列函数确定,该函数可以是任何总是将相同输入映射到相同输出的函数。我们假设哈希函数是O(1)。因此,当我们在哈希表中插入一些东西时,我们使用哈希函数(我们称之为 h )找到放置的位置,并放在那里。现在插入另一个东西,哈希和存储。另一个。每次插入数据时,都需要O(1)次插入它(因为散列函数是O(1)。)

查找数据是相同的。我们希望找到一个值, x ,我们只需找出 h (x),它告诉我们 x 位于散列表,所以我们可以在O(1)中查找任何散列值。

现在说谎:上面是一个非常好的理论,有一个问题:如果我们插入数据并且数组中已经存在某些东西,那该怎么办?没有什么能保证哈希函数不会为两个不同的输入产生相同的输出(除非你有一个完美哈希函数,但这些都很难产生)。因此,当我们插入时,我们需要采取以下两种策略之一:




  • 在数组中的每个点上存储多个值(比如说,每个数组槽都有一个链表)现在,当您执行查找时,它仍然是O(1)到达c或者在数组中放置正确的位置,但可能会在一个(希望是短的)链表中进行线性搜索。这被称为单独链接。

  • 如果您发现已经存在的内容,请再次散列并查找其他位置。重复,直到你找到一个空的地方,并把它放在那里。查找过程可以遵循相同的规则来查找数据。现在它仍然是O(1)到达第一个位置,但是有一个可能的(希望是短的)线性搜索在桌子上反弹,直到你找到你之后的数据。这就是所谓的开放寻址。



基本上,两种方法仍然主要是O(1),但以一个希望的短线性序列。我们可以为大多数目的假设它是O(1)。如果散列表变得太满,那些线性搜索可能变得越来越长,然后是重新散列的时候,这意味着要创建一个更大尺寸的新散列表并将所有数据重新插入到它中。


When we use a HashTable for storing data, it is said that searching takes o(1) time. I am confused, can anybody explain?

解决方案

Well it's a little bit of a lie -- it can take longer than that, but it usually doesn't.

Basically, a hash table is an array containing all of the keys to search on. The position of each key in the array is determined by the hash function, which can be any function which always maps the same input to the same output. We shall assume that the hash function is O(1).

So when we insert something into the hash table, we use the hash function (let's call it h) to find the location where to put it, and put it there. Now we insert another thing, hashing and storing. And another. Each time we insert data, it takes O(1) time to insert it (since the hash function is O(1).

Looking up data is the same. If we want to find a value, x, we have only to find out h(x), which tells us where x is located in the hash table. So we can look up any hash value in O(1) as well.

Now to the lie: The above is a very nice theory with one problem: what if we insert data and there is already something in that position of the array? There is nothing which guarantees that the hash function won't produce the same output for two different inputs (unless you have a perfect hash function, but those are tricky to produce). Therefore, when we insert we need to take one of two strategies:

  • Store multiple values at each spot in the array (say, each array slot has a linked list). Now when you do a lookup, it is still O(1) to arrive at the correct place in the array, but potentially a linear search down a (hopefully short) linked list. This is called "separate chaining".
  • If you find something is already there, hash again and find another location. Repeat until you find an empty spot, and put it there. The lookup procedure can follow the same rules to find the data. Now it's still O(1) to get to the first location, but there is a potentially (hopefully short) linear search to bounce around the table till you find the data you are after. This is called "open addressing".

Basically, both approaches are still mostly O(1) but with a hopefully-short linear sequence. We can assume for most purposes that it is O(1). If the hash table is getting too full, those linear searches can become longer and longer, and then it is time to "re-hash" which means to create a new hash table of a much bigger size and insert all the data back into it.

这篇关于哈希如何具有o(1)搜索时间?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆