Why does HashMap resize in the case of collisions or in the worst case


Problem description


I am asking this question with respect to Java versions up to 1.7 only. I am using reflection to find out the current capacity of a HashMap. In the program below I put 12 unique Person objects into a single bucket of the HashMap (all with the same hash code). Then I put a 13th unique Person into the same or a different bucket (using the same or a different hash code). In both cases, after adding this 13th element, the HashMap resizes to 32 buckets. I understand that, with the load factor of 0.75 and the initial capacity of 16 (threshold 16 * 0.75 = 12), the HashMap doubles its capacity on the 13th element. But there are still empty buckets available, and only 2 buckets are used by these 13 elements.

My questions are:


1) Is my understanding correct? Am I making a mistake somewhere? Is this the expected behavior of HashMap?


2) If all this is correct, then even though there are 12 or 11 free buckets, why does the HashMap need to double on the 13th element in this case? Isn't resizing the HashMap extra overhead, and costly? Why does the HashMap need to double here, when the 13th element could be placed into any available bucket according to its hash code?

import java.lang.reflect.Field;
import java.util.HashMap;

public class HashMapTest {
    public static void main(String[] args) throws NoSuchFieldException,
            SecurityException, IllegalArgumentException, IllegalAccessException {
        HashMap<Person, String> hm = new HashMap<Person, String>();
        for (int i = 1; i <= 12; i++) {
            // 12 entries in the same bucket (linked list)
            hm.put(new Person(), "1");
        }
        System.out.println("Number of Buckets in HashMap : "+bucketCount(hm));
        System.out.println("Number of Entry in HashMap :  " + hm.size());
        System.out.println("**********************************");
        // 13th element goes to a different bucket (hashCode 1 instead of 0)
        hm.put(new Person(2), "2");
        System.out.println("Number of Buckets in HashMap : "+bucketCount(hm));
        System.out.println("Number of Entry in HashMap :  " + hm.size());
    }
    public static int bucketCount(HashMap<Person, String> h)
            throws NoSuchFieldException, SecurityException,
            IllegalArgumentException, IllegalAccessException {
        Field tableField = HashMap.class.getDeclaredField("table");
        tableField.setAccessible(true);
        Object[] table = (Object[]) tableField.get(h);
        return table == null ? 0 : table.length;
    }
}

class Person {
    int age = 0;
    Person() {
    }
    Person(int a) {
        age = a;
    }
    @Override
    public boolean equals(Object obj) {
        return false;
    }
    @Override
    public int hashCode() {
        if (age != 0) {
            return 1;
        } else {
            return age;
        }
    }
}

Output

Number of Buckets in HashMap : 16
Number of Entry in HashMap :  12
**********************************
Number of Buckets in HashMap : 32
Number of Entry in HashMap :  13

Recommended answer

  1. Yes, this is the expected behavior.
  2. The HashMap doesn't care about how many buckets are used. It only knows that the load factor has been reached, that the probability of collisions is therefore becoming too big, and that the map should thus be resized. Even though many collisions have already happened, resizing the map can actually fix that. Not in your case, since you chose identical hashCodes on purpose, but in a more realistic case the hashCodes should have a much better distribution. HashMap can't do anything to make itself efficient if you choose bad hashCodes on purpose, and there is no point in adding complexity to handle an extreme case that should never happen and that HashMap wouldn't be able to fix anyway.
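To make the second point concrete, here is a minimal sketch of the put-time resize decision, assuming JDK 1.7-style behavior. The class, field, and method names (ResizeSketch, afterInsert, LOAD_FACTOR, and so on) are illustrative, not the actual HashMap internals; the point is only that the check compares the total entry count against capacity * loadFactor and never looks at how many buckets are occupied.

// A minimal sketch of the resize decision, assuming JDK 1.7-style behavior.
// All names here are illustrative, not the real HashMap internals.
class ResizeSketch {
    static final float LOAD_FACTOR = 0.75f;
    int capacity = 16;                                  // number of buckets
    int threshold = (int) (capacity * LOAD_FACTOR);     // 16 * 0.75 = 12
    int size = 0;                                       // entries in the whole map

    void afterInsert() {
        size++;
        // Only the total entry count is compared against the threshold;
        // how many buckets are actually occupied never enters the decision.
        if (size > threshold) {
            capacity *= 2;                              // 16 -> 32 on the 13th entry
            threshold = (int) (capacity * LOAD_FACTOR); // new threshold: 24
            // A real resize also rehashes every existing entry into the new,
            // larger table, which is what shortens the chains when the hash
            // codes are well distributed.
        }
    }
}

With hashCodes like the ones in the question, all 13 entries still end up in just two chains after the resize, which is exactly why the answer says that resizing cannot fix a deliberately bad hashCode.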
