What's the worst case running time of Hashing with Chaining?

Question

Suppose the number of elements in the table (say n) is proportional to the number of hash table slots (say m). Then n = O(m), and the load factor is l = n/m = O(m)/m = O(1). So under the assumption of simple uniform hashing, searching takes constant time on average. That is, on average a search takes time proportional to the length of the linked list, which is the same for all slots and hence constant. But what about the worst-case running time under the assumption of simple uniform hashing? Is it also constant, or is it O(1 + l)? Please explain, I'm confused. [Reference: CLRS, page 260]

Will the worst-case time for an unsuccessful search under the assumption of simple uniform hashing be the same as the average-case time? And will the worst-case time for a successful search under the same assumption differ from the average-case time?

Recommended answer

Under the assumption of Simple Uniform Hashing (i.e. that a hypothetical hash function will evenly distribute items into the slots of a hash table), I believe the worst-case performance for a lookup operation would be the same as the average case (for an unsuccessful lookup): Θ(n/m + 1) (average case as per Wikipedia), where n is the number of elements and m is the number of slots.

Why? Well, consider that, under the above assumption, each slot in the table will have the same number of elements in its chain. Because of this, both the average case and the worst case involve looking through all the elements in one of the chains.
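
As a rough illustration of that reasoning, here is a minimal sketch in which the keys happen to spread perfectly evenly. The function names, the choice of h(k) = k mod m, and the values n = 20 and m = 5 are assumptions made for this example, not something taken from CLRS or the answer. With every chain holding n/m elements, any unsuccessful lookup scans exactly n/m entries.

```python
def build_table(keys, m):
    """Build a chained hash table with m slots using h(k) = k mod m."""
    table = [[] for _ in range(m)]
    for k in keys:
        table[k % m].append(k)
    return table


def unsuccessful_lookup_cost(table, key):
    """Count the key comparisons made while searching for an absent key."""
    chain = table[key % len(table)]
    comparisons = 0
    for item in chain:
        comparisons += 1
        if item == key:  # never true for an absent key
            break
    return comparisons


n, m = 20, 5  # hypothetical sizes: 20 elements, 5 slots
table = build_table(range(n), m)

# Keys 0..19 modulo 5 land evenly, so every chain has n/m elements.
chain_lengths = [len(chain) for chain in table]

# Keys n..n+m-1 are absent and each probes a different slot.
costs = [unsuccessful_lookup_cost(table, n + i) for i in range(m)]

print(chain_lengths)  # every chain has the same length
print(costs)          # every unsuccessful lookup scans a full chain
```

In this evenly spread setting the most expensive unsuccessful lookup costs the same as the average one, which is the point the answer is making.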

This is, of course, a pretty optimistic assumption. In practice we can rarely, if ever, predetermine a hash function that will evenly distribute some unknown set of data (and we rarely build hash functions specifically for data sets), but, at the same time, we're unlikely to hit the true worst case.

In general, the worst-case running time of a lookup or remove operation for a hash table using chaining is Θ(n).

In both cases, insert can still be implemented as Θ(1), since you can just insert at the front of the chain. That holds if we allow duplicates (as Jim mentioned); if not, we first have to check whether the element is already there (i.e. do a lookup).
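
The difference between the two kinds of insert can be sketched as follows. This is only an illustration: the class and method names (`ChainedHashTable`, `insert_allow_duplicates`, `insert_unique`) are hypothetical, and Python's built-in `hash` stands in for whatever hash function the table would really use.

```python
class Node:
    """One link in a chain."""

    def __init__(self, key, next_node=None):
        self.key = key
        self.next = next_node


class ChainedHashTable:
    def __init__(self, m):
        self.slots = [None] * m  # each slot holds the head of a chain

    def _index(self, key):
        return hash(key) % len(self.slots)

    def insert_allow_duplicates(self, key):
        """Constant time: push the key onto the front of its chain."""
        i = self._index(key)
        self.slots[i] = Node(key, self.slots[i])

    def contains(self, key):
        """Walk the chain; cost is proportional to the chain length."""
        node = self.slots[self._index(key)]
        while node is not None:
            if node.key == key:
                return True
            node = node.next
        return False

    def insert_unique(self, key):
        """Needs a lookup first, so it inherits the lookup's cost."""
        if self.contains(key):
            return False
        self.insert_allow_duplicates(key)
        return True


t = ChainedHashTable(4)
t.insert_allow_duplicates(7)
t.insert_allow_duplicates(7)   # a second copy is accepted without any scan
rejected = t.insert_unique(7)  # False: the lookup finds 7 already present
accepted = t.insert_unique(8)  # True: 8 was absent, so it is inserted
```

Here `insert_allow_duplicates` never inspects the chain, while `insert_unique` has to call `contains` first, so its worst case is the same as a lookup's.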

The worst case happens when all the elements hash to the same value, so you'd have one really long chain, essentially turning your data structure into a linked list.

|--------|
|element1| -> element2 -> element3 -> element4 -> element5
|--------|
|  null  |
|--------|
|  null  |
|--------|
|  null  |
|--------|
|  null  |
|--------|
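
The degenerate layout in the diagram can be reproduced with a small sketch. The constant hash function (every key goes to slot 0) and the function names are assumptions chosen purely to force the worst case. With all five elements in one chain, finding the last one, or confirming that a key is missing, takes n comparisons.

```python
def build_degenerate_table(keys, m):
    """Build a chained table whose hash function sends every key to slot 0."""
    table = [[] for _ in range(m)]
    for k in keys:
        table[0].append(k)  # worst-case hash: h(k) = 0 for every key
    return table


def lookup_cost(table, key):
    """Count comparisons made while scanning the only non-empty chain."""
    comparisons = 0
    for item in table[0]:
        comparisons += 1
        if item == key:
            break
    return comparisons


keys = ["element1", "element2", "element3", "element4", "element5"]
table = build_degenerate_table(keys, m=5)

lengths = [len(chain) for chain in table]      # one chain of 5, four empty slots
cost_last = lookup_cost(table, "element5")     # successful, but at the chain's end
cost_missing = lookup_cost(table, "element6")  # unsuccessful: scans everything
```

Both `cost_last` and `cost_missing` equal the number of stored elements here, which is the Θ(n) behaviour described above.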
