在HashMap中为什么阈值(下一个要调整大小的大小值)是容量*负载因子。为什么不等于地图的大小或容量 [英] In HashMap why threshold value (The next size value at which to resize) is capacity * load factor. Why not as equal to size or capacity of map

查看:1294
本文介绍了在HashMap中为什么阈值(下一个要调整大小的大小值)是容量*负载因子。为什么不等于地图的大小或容量的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

在HashMap中,为什么阈值(要调整大小的下一个大小值)为 capacity * load 因子。为什么不等于大小地图容量

In HashMap why threshold value (The next size value at which to resize) is capacity * load factor. Why not as equal to size or capacity of map.

例如,最初默认容量= 16,负载因子= 0.75,因此 threshold =(capacity * load factor)=(16 * 0.75) 12

For example initially default capacity = 16 , load factor = 0.75 and hence threshold = (capacity * load factor) = (16 * 0.75) = 12.

当我们添加第13个元素时,地图调整大小为什么会这样,为什么地图作者决定保留 capacity * load factor (12)?为什么不让容量(等于16)。

Map resize when we add 13th element why is it so, why author of map decided to keep it capacity * load factor (which is 12) ? Why not same as capacity (which is 16).

为什么不保持阈值等于容量,以便重新散列只发生在hashmap充满?

why not keep threshold value equal to capacity so that rehashing only takes place when hashmap gets full ?

推荐答案

Javadoc,Javadoc,Javadoc。这是你看的第一个地方。在 HashMap 它说:

Javadoc, Javadoc, Javadoc. That is the first place you look. On the HashMap it says:


作为一般规则,默认负载因子(.75)在时间和空间成本之间提供了良好的折衷
。更高的值减少了空间开销
,但增加了查找成本(反映在
HashMap类的大多数操作中,包括get和put)。在设置其初始容量时,在映射中的
期望的条目数量及其负载因子应该被考虑为
,以便最小化重新散列操作的
数量。如果初始容量比最大条目数除以负载因子大
,则不会发生
重新哈希操作。

As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the HashMap class, including get and put). The expected number of entries in the map and its load factor should be taken into account when setting its initial capacity, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operations will ever occur.

对于散列图的理论 - 如果你的地图是满的,那么你在做一些非常,非常错误的事情。到那时,您可能在使用随机数据 - BAD 查找时 O(sqrt(N))。您从不希望您的哈希表已满。但是一个非常稀疏的地图将浪费太多的空间(如你所指出的),并且花费太长的时间来迭代。因此,应该是一个负载因子,在大多数情况下小于1.

As on the theory of hash maps - if your map is full, then you're doing something very, very wrong. By that time you're likely at O(sqrt(N)) on lookups with random data - BAD. You never want your hashmap to be full. But a very sparse map will waste too much space (as you've noted), and will take too long to iterate through. Hence there should be a load factor, that is less than 1 for most use cases.

注意:浪费空间地图的大小,并且与负载因子成反比。然而,查找时间具有更复杂的预期性能函数。这意味着相同的负载因子不适用于不同大小的哈希映射 - 因为这意味着不同的规模权衡。

Note: The "wasted space" is proportional to the size of the map, and inversely proportional to the load factor. However lookup times have a more complex expected performance function. This means that the same load factor will not work for different size hash maps - as it will mean different scale tradeoffs.

权衡的一般概述可以在KnuthThe Art of Computer Programmingvol 3中找到。

A general overview of the tradeoffs can be found in Knuth "The Art of Computer Programming" vol 3.

这篇关于在HashMap中为什么阈值(下一个要调整大小的大小值)是容量*负载因子。为什么不等于地图的大小或容量的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆