为什么在搜索键时,哈希表查找只需要O(1)时间,即O(n)? [英] Why is a hash table lookup only O(1) time when searching for a key is O(n)?

查看:326
本文介绍了为什么在搜索键时,哈希表查找只需要O(1)时间,即O(n)?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

从技术上讲,根据我在这里阅读的帖子,在最坏的情况下,哈希表的确是O(n)时间查询.但是我不知道内部机制如何保证平均时间为O(1).

Technically speaking, based on posts I've read here, Hash table is indeed O(n) time lookup in the worst case. But I don't get how the internal mechanics guarantee it to be O(1) time on average.

我的理解是,给定一些n元素,理想情况是存在n个存储桶,这会导致O(1)空间.这就是我卡住的地方.假设我要查找字典中是否有键,这肯定要花O(n)的时间.那么,当我想通过使用其键的哈希值搜索元素是否在哈希表中时,为什么它会有所不同呢?简而言之,使用原始键值进行搜索的时间为O(n),而使用哈希值进行搜索的时间为O(1).这是为什么?

My understanding is that given some n elements, the ideal case is there are n buckets, which result in O(1) space. This is where I'm stuck. Suppose I want to lookup whether a key is in the dictionary, this definitely takes me O(n) time. So why does it make a difference when I want to search whether an element is in the hash table by using its key's hash value? To put it concisely, searching using raw key values gives O(n) time but using hash values it's O(1) time. Why is that?

我是否仍然需要一个一个地查找哈希值以查看哪个匹配?为什么散列让我立即知道要检索哪个元素,或者该元素是否存在?

Wouldn't I still need to look up the hash values one by one to see which one matches? Why does the hashing let me know immediately which element to retrieve or whether such an element exists or not?

推荐答案

我认为您在混淆术语的同时,还会因考虑存储桶而使事情复杂化.

I think you are confusing terminology and also complicating matters by thinking about buckets.

让我们想象一个哈希表,它以长度为n的数组a的形式实现.我们还假设我们有n个可能的键和一个完美的哈希函数H,该函数将每个键k映射到a中的唯一索引i.

Let's imagine a hash table that is implemented in as an array a of length n. Let's also imagine we have n possible keys and a perfect hash function H that maps each key k to a unique index i in a.

让我们通过将a中的每个值设置为nil来初始化哈希表.

Let's initialize our hash table by setting every value in a to nil.

我们可以将键值对(k1, v1)插入哈希表,方法是将值放置在数组中的适当位置:

We can insert a key, value pair (k1, v1) into our hash table by placing the value at the appropriate position in the array:

a[H(k1)] = v1

现在让我们说,稍后我们忘记了k1是否在哈希表中,我们想检查它是否在其中.为此,我们只需查找a[H(k1)]并查看是否存在任何值,即a[H(k1)] != nil.显然,这是一个恒定的时间查找.

Now let's say that later we forgot if k1 is in the hash table and we want to check if it's there. To do this we simply look up a[H(k1)] and see if any value is there, i.e. a[H(k1)] != nil. This is clearly a constant time lookup.

但是,如果我们想查看v1甚至其他v2是否在哈希表中的任何地方,该怎么办?这不是那么容易,因为我们没有将vi映射到数组中某个位置的函数.它可以与任何键关联.因此,查看表中是否存在该表的唯一方法是扫描整个数组,检查每个值:

But what if we want to see if v1, or even some other v2 is anywhere in our hashtable? This is not so easy because we have no function that maps a vi to a position in our array. It could be associated with any key. So the only way to see if it exists in the table is to scan the entire array, checking every value:

for i in 0..n-1:
  if a[i] == v2:
    return true
return false

为了更具体一点,请想象您的键是名称,而值是居住城市.现在比较询问"Bob Jones是否在哈希表中?"到哈希表中有纽约人吗?".我们可以对鲍勃·琼斯"进行哈希处理,以查看相应数组位置中是否有任何内容(因为这样会插入鲍勃·琼斯"),但是我们没有类似的快速方法来查找纽约".

To make this a bit more concrete, imagine your keys are names, and your values are cities of residence. Now compare asking "Is Bob Jones in the hash table?" to "Is there anyone from New York in the hash table?". We can hash "Bob Jones" and see if there's anything in the corresponding array position (because that's how "Bob Jones" would have been inserted), but we have no similarly quick way to look up "New York".

我假设这就是您要的内容,并且您对术语有些困惑.如果这不是您想要的,请发表评论.

I am assuming this is what you are asking, and you have confused the terminology a bit. Please comment if this is not what you wanted.

这篇关于为什么在搜索键时,哈希表查找只需要O(1)时间,即O(n)?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆