带有逻辑删除的哈希表的加载因子 [英] Load factor of hash tables with tombstones

查看:120
本文介绍了带有逻辑删除的哈希表的加载因子的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

因此出现了一个问题,即在计算哈希表的负载因子时是否应包括墓碑.

So the question came up about whether tombstones should be included when calculating the load factor of a hash table.

我认为,鉴于使用负载系数来确定何时扩展容量,因此不应该包括墓碑.一个明显的例子是,您几乎要填充然后删除哈希表中的每个值.在这里插入非常容易(没有冲突),因此我认为负载因子不应该包括它们.

I thought that, given that the load factor is used to determine when to expand capacity, tombstones should not be included. An obvious example is if you almost fill and then remove every value in a hash table. Here insertions are super easy (no collisions) so I believe the load factor shouldn't include them.

但是您可以查看一下,并认为对于所有墓碑,查找将很慢(可能会搜索几乎整个空间).

But you could look at this and think that with all the tombstones lookups will be slow (potentially searching almost the entire space).

所以我想问这个问题.哈希表的负载因子应在计算中包括墓碑吗?

So I thought I'd ask the question. Should the load factor of a hashtable include tombstones in the calculation?

推荐答案

负载因子不是哈希表数据结构的必要组成部分-它是为动态系统定义行为规则的方法(增长/收缩哈希表是一个动态系统).

Load factor is not an essential part of hash table data structure -- it is the way to define rules of behaviour for the dymamic system (growing/shrinking hash table is a dynamic system).

此外,在我看来,在95%的现代哈希表情况下,这种方法过于简化,动态系统的运行性能却欠佳.优点:

Moreover, in my opinion, in 95% of modern hash table cases this way is over simplified, dynamic systems behave suboptimally. It's advantages:

  • 那么,理解和实施的简单性.
  • 哈希表数据结构不应存储带有阈值的许多数字-可能只有一个数字.当哈希表非常小且标头的大小影响整个数据结构的内存效率(以字节为单位存储条目)时,这是有意义的.
  • 在某些(常见)情况下:仅追加/更新哈希表,更复杂的行为模型退化为"just load factor"模型,换句话说,load factor模型定义了相对最佳的行为.

另请参阅我对负载系数模型的回答.我更喜欢 [最小负载,目标负载,最大负载] +生长因子框架模型.


如果您使用墓碑开发通用哈希表,我认为您可以拿起我的结果(如下).我可能花了数周的时间专门开发此模型.也许您可以进行一些改进或进一步研究,我会很高兴.

See also my answer on load factor model. I prefer [min load, target load, max load] + growth factor frame model.


If you develop general-purpose hash table with tombstones, I think you can just pick up my results (below). I spend maybe several weeks solely developing this model. Maybe you can make some improvements or further research, I would be glad.

有两个主要的哈希表动态行为模式作为目标:

Two main hash table dynamic behaviour patterns are targeted:

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆