Hash table runtime complexity (insert, search and delete)


Question

Why do I keep seeing different runtime complexities for these functions on a hash table?

On wiki, search and delete are O(n) (I thought the point of hash tables was constant-time lookup, so what's the point if search is O(n)?).

In some course notes from a while ago, I saw a wide range of complexities depending on certain details, including one with all O(1). Why would any other implementation be used if I can get all O(1)?

If I'm using standard hash tables in a language like C++ or Java, what can I expect the time complexity to be?

Answer

Hash tables have O(1) average and amortized case complexity; however, they suffer from O(n) worst-case time complexity. [And I think this is where your confusion is.]
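
For the standard-library part of the question: both java.util.HashMap and C++'s std::unordered_map offer these same guarantees (average-case constant time, linear worst case). A minimal Java sketch of the three operations:

```java
import java.util.HashMap;
import java.util.Map;

public class HashMapDemo {
    public static void main(String[] args) {
        Map<String, Integer> ages = new HashMap<>();

        ages.put("alice", 30);  // insert: O(1) average, O(n) worst case
        ages.put("bob", 25);

        Integer a = ages.get("alice");  // search: O(1) average
        System.out.println("alice -> " + a);

        ages.remove("bob");             // delete: O(1) average
        System.out.println("contains bob? " + ages.containsKey("bob"));
    }
}
```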

Hash tables suffer from O(n) worst-case time complexity for two reasons:

  1. If too many elements hash into the same bucket, scanning that bucket may take O(n) time (see the sketch after this list).
  2. Once the hash table's load factor passes its threshold, it has to rehash [create a new, bigger table and re-insert every element].
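
To make the first failure mode concrete, here is a minimal separate-chaining sketch in Java; the class and its deliberately bad hash function are illustrative assumptions, not any library's implementation. With every key forced into one bucket, a lookup degenerates into a linear scan of all n keys.

```java
import java.util.LinkedList;

// Minimal separate-chaining table with a worst-case hash function.
class ChainedTable {
    private final LinkedList<String>[] buckets;

    @SuppressWarnings("unchecked")
    ChainedTable(int capacity) {
        buckets = new LinkedList[capacity];
        for (int i = 0; i < capacity; i++) {
            buckets[i] = new LinkedList<>();
        }
    }

    // Degenerate hash: every key lands in bucket 0.
    private int badHash(String key) {
        return 0;
    }

    void insert(String key) {
        buckets[badHash(key)].add(key);
    }

    // With all n keys chained in one bucket, this scan is O(n), not O(1).
    boolean contains(String key) {
        for (String k : buckets[badHash(key)]) {
            if (k.equals(key)) {
                return true;
            }
        }
        return false;
    }
}
```

Replacing badHash with a proper function, e.g. Math.floorMod(key.hashCode(), buckets.length), spreads keys across the buckets and restores the expected average O(1) lookups.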

However, it is said to be O(1) average and amortized case because:

  1. It is very rare that many items will hash into the same bucket [if you chose a good hash function and the load factor is not too high].
  2. The rehash operation, which is O(n), can happen at most once per n/2 operations, all of which are assumed to be O(1). Thus, when you average the time per operation, you get (n*O(1) + O(n)) / n = O(1); the sketch after this list checks this empirically.
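
The amortized claim is easy to check. Below is a minimal Java sketch that simulates a table doubling its capacity when full (the starting capacity and doubling policy are assumptions for illustration) and counts the element moves caused by rehashing; the average work per insert stays bounded by a small constant.

```java
// Simulates n inserts into a table that doubles when full and counts
// the total work, to show the amortized O(1) cost per insert.
public class AmortizedRehashDemo {
    public static void main(String[] args) {
        int n = 1_000_000;
        int capacity = 8;   // assumed starting capacity
        int size = 0;
        long moves = 0;     // elements copied during rehashes

        for (int i = 0; i < n; i++) {
            if (size == capacity) { // table full: rehash
                moves += size;      // every element is re-inserted
                capacity *= 2;      // assumed growth policy: doubling
            }
            size++;                 // the O(1) insert itself
        }

        // Total work is n inserts plus fewer than 2n rehash moves,
        // so the average cost per insert is a small constant.
        System.out.printf("inserts=%d, rehash moves=%d, avg work/op=%.2f%n",
                n, moves, (n + moves) / (double) n);
    }
}
```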

Note that because of the rehashing issue, real-time applications and applications that need low latency should not use a hash table as their data structure.

Another issue with hash tables: cache
You might also see a performance loss in large hash tables due to cache performance. Hash tables suffer from poor cache locality, so for large collections the access time might be longer, since you need to reload the relevant part of the table from memory back into the cache.
