Why the initialCapacity of Hashtable is 11 while the DEFAULT_INITIAL_CAPACITY in HashMap is 16 and requires a power of 2


Problem description

Comparing the HashMap and Hashtable source code in JDK 1.6, I saw the code below inside HashMap:

/**
 * The default initial capacity - MUST be a power of two.
 */
static final int DEFAULT_INITIAL_CAPACITY = 16;

    // (from the HashMap(int, float) constructor: rounds the requested
    // capacity up to the next power of two)
    int capacity = 1;
    while (capacity < initialCapacity)
        capacity <<= 1;
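
For context, that loop rounds whatever capacity you request up to the next power of two. A minimal standalone sketch of the same behavior (the class and method names are my own, just for illustration):

    public class CapacityDemo {
        // Mirrors the rounding loop quoted above from HashMap's constructor.
        static int roundUp(int initialCapacity) {
            int capacity = 1;
            while (capacity < initialCapacity)
                capacity <<= 1;
            return capacity;
        }

        public static void main(String[] args) {
            System.out.println(roundUp(11)); // prints 16
            System.out.println(roundUp(16)); // prints 16
            System.out.println(roundUp(17)); // prints 32
        }
    }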

However, in Hashtable, I saw the code below:

table = new Entry[initialCapacity];

public Hashtable() {
    this(11, 0.75f);
}

So my question is: why does HashMap require a power of 2 as its initial capacity, while Hashtable chose 11 as its default initial capacity? I assume this has nothing to do with Hashtable being thread safe or disallowing null keys and values.

thx.

Solution

The following article addresses this question in some detail: HashMap requires a better hashCode() - JDK 1.4 Part II.

According to that article, the main reason to move to power-of-two sizes was that bit masking is faster than integer division. This is not without adverse consequences, which are explained by one of the original authors:
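
To make the speed argument concrete: with a power-of-two table length, the bucket index is a single AND against length - 1, whereas a table of arbitrary length (such as Hashtable's 11) needs an integer division. A minimal sketch, with the first method in the shape of JDK 1.6 HashMap's indexFor and the second in the shape of Hashtable's index computation:

    public class IndexDemo {
        // Power-of-two length: keep only the low-order bits of the hash.
        static int indexForPowerOfTwo(int hash, int length) {
            return hash & (length - 1); // valid only when length is a power of two
        }

        // Arbitrary length: clear the sign bit, then take the remainder.
        static int indexForAnyLength(int hash, int length) {
            return (hash & 0x7FFFFFFF) % length;
        }

        public static void main(String[] args) {
            int h = "example".hashCode();
            System.out.println(indexForPowerOfTwo(h, 16)); // single AND, cheap
            System.out.println(indexForAnyLength(h, 11));  // integer division, slower
        }
    }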

Joshua Bloch: The downside of using a power-of-two is that the resulting hash table is very sensitive to the quality of the hash function (hashCode). It is imperative that any change in the input must affect the low order bits of the hash value. (Ideally, it should affect all bits of the hash value with equal likelihood.) Because we have no assurance that this is true, we put in a secondary (or "defensive") hash function when we switched to the power-of-two hash table. This hash function is applied to the results of hashCode before masking off the low order bits. Its job is to scatter the information over all the bits, and in particular, into the low order bits. Of course it has to run very fast, or you lose the benefit of switching to the power-of-two-sized table. The original secondary hash function in 1.4 turned out to be insufficient. We knew that this was a theoretical possibility, but we thought that it didn't affect any practical data sets. We were wrong. The replacement secondary hash function (which I developed with the aid of a computer) has strong statistical properties that pretty much guarantee good bucket distribution.
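
For reference, the defensive hash Bloch describes is still visible in JDK 1.6, where HashMap spreads the high bits of hashCode into the low bits before the mask is applied. Reproduced here from memory of the 1.6 source, as an illustration of the idea rather than a specification:

    // JDK 1.6 java.util.HashMap: XOR-shift the hash so that hashCodes
    // that differ only in their high bits still land in different
    // buckets after the (length - 1) mask.
    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }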
