ArrayList和HashSet插入性能测试结果让我困惑 [英] ArrayList and HashSet insert performance test result confuse me

查看:278
本文介绍了ArrayList和HashSet插入性能测试结果让我困惑的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我wirte一个类来测试arraylist和hashset之间的插入性能,正如我所料,哈希表插入性能将比arraylist(可能是书欺骗我),但测试结果让我这么困惑

i wirte a class to test the insert performance between arraylist and hashset,as i expect ,the hashset insert performance will be much better than arraylist(maybe the book deceived me),but the test result make me so confused

    HashSet<String> hashSet = new HashSet<String>();

    long start = System.currentTimeMillis();
    for (int i = 0; i < 900000; i++) {
        hashSet.add(String.valueOf(i));
    }

    System.out.println("Insert HashSet Time: " + (System.currentTimeMillis() - start));


    ArrayList<String> arrayList = new ArrayList<String>();

    start = System.currentTimeMillis();

    for (int i = 0; i < 900000; i++) {
        arrayList.add(String.valueOf(i));
    }
    System.out.println("Insert ArrayList Time: " + (System.currentTimeMillis() - start));

result:
Insert HashSet Time: 978
Insert ArrayList Time: 287

我运行这个主metod很多次,结果没有更多的不同之处,插入数组列表的时间比插入哈希表的时间短很多
任何人都可以解释这个奇怪的结果。

i run this main metod many times and the result has no more different between this,the insert arraylist time is much shorter than insert hashset time can anybody explain this weird result.

推荐答案

数据结构和算法的确切性能特征高度依赖于机器和实现。但是,对于我来说, ArrayList 插入将比通过常量因子插入 HashSet 快得多。要插入 ArrayList ,您只需要在数组中的特定索引处设置一个值。要插入到哈希集中,您需要为插入的项目计算哈希码,并将其映射到数组索引,检查索引并根据查找结果执行某些操作,最后插入数组。此外, HashSet 将具有更差的内存位置,因此您将更频繁地获取缓存缺失。

Exact performance characteristics of datastructures and algorithms are highly machine- and implementation-specific. However, it doesn't seem surprising to me that ArrayList inserts would be faster than HashSet inserts by a constant factor. To insert into an ArrayList, you just need to set a value at a particular index in an array. To insert into a hash set, you need to compute a hashcode for the inserted item and map that to an array index, check that index and possibly perform some action based on what you find, and finally insert into the array. Furthermore the HashSet will have worse memory locality so you'll get cache misses more often.

这两个数据结构都需要做,但是两个数据结构都需要以大约相同的速率调整大小(由于重新散列,哈希表调整大小也可能更加昂贵,因为常数因子)。

There's also the question of array resizing, which both data structures will need to do, but both data structures will need to resize at about the same rate (and hash table resizing is probably more expensive by a constant factor, too, due to rehashing).

两种算法都是恒定的(预期的)时间,但是有一个散列表比数组列表更多的东西。所以,它是不​​奇怪的,它会慢一个恒定的因素。 (再次,确切的差别在很大程度上取决于机器和实现。)

Both algorithms are constant (expected) time, but there's a lot more stuff to do with a hash table than an array list. So it's not surprising that it would be slower by a constant factor. (Again, the exact difference is highly dependent on machine and implementation.)

这篇关于ArrayList和HashSet插入性能测试结果让我困惑的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆