Performance of HashMap with different initial capacity and load factor


Problem Description



Here is my situation. I am using two java.util.HashMap instances to store some frequently used data in a Java web app running on Tomcat. I know the exact number of entries in each HashMap. The keys will be Strings and Integers, respectively.

My question is: what is the best way to set the initial capacity and load factor?

Should I set the capacity equal to the number of elements it will have and the load factor to 1.0? I would like the absolute best performance without using too much memory. I am afraid, however, that the table would not fill optimally. With a table of exactly the size needed, won't there be key collisions, causing a (usually short) scan to find the correct element?

Assuming (and this is a stretch) that the hash function is a simple mod 5 of the integer keys, wouldn't that mean that keys 5, 10, and 15 would hit the same bucket and then cause a seek to fill the buckets next to them? Would a larger initial capacity improve performance?
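For context on that worry: Java's HashMap does not probe neighboring slots on a collision; it chains colliding entries within the same bucket (and, since Java 8, treeifies long chains). A rough sketch of how a power-of-two table maps keys to buckets, mirroring the spread-then-mask step HashMap actually uses:

```java
// Sketch: how java.util.HashMap (Java 8+) picks a bucket for a key.
// The table length is always a power of two, so the index is a bitmask,
// not a modulo; keys that land in the same bucket are chained there,
// not probed into adjacent slots.
public class BucketDemo {
    // The same hash-spreading step HashMap applies before masking,
    // so that high bits of hashCode() influence the bucket index.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    static int bucketIndex(Object key, int tableLength) {
        return (tableLength - 1) & spread(key.hashCode());
    }

    public static void main(String[] args) {
        int tableLength = 16; // HashMap's default initial capacity
        // Integer.hashCode() is the int value itself, so integer keys
        // 16 apart (5, 21, 37, ...) collide in a 16-slot table.
        System.out.println(bucketIndex(5, tableLength));  // 5
        System.out.println(bucketIndex(21, tableLength)); // 5
        System.out.println(bucketIndex(37, tableLength)); // 5
    }
}
```

So a larger table spreads keys across more buckets and shortens chains, but collisions cost a short chain walk, not a scan of neighboring buckets.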

Also, if there is a better data structure than a HashMap for this, I am completely open to that as well.

Solution

In the absence of a perfect hash function for your data, and assuming this is really not a micro-optimization of something that doesn't actually matter, I would try the following:

Assume the default load factor (.75) used by HashMap is a good value in most situations. That being the case, you can use it, and set the initial capacity of your HashMap based on your own knowledge of how many items it will hold - set it so that initial capacity x .75 = number of items (rounded up).
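That rule of thumb can be sketched as follows (the helper name and the 1000-entry figure are just for illustration):

```java
import java.util.HashMap;
import java.util.Map;

public class SizedMapDemo {
    // Smallest initial capacity that keeps a map of n entries at or below
    // the default 0.75 load factor, so no resize happens while filling it.
    static int capacityFor(int n) {
        return (int) Math.ceil(n / 0.75);
    }

    public static void main(String[] args) {
        int expectedEntries = 1000;
        // capacityFor(1000) = 1334; HashMap rounds this up internally
        // to the next power of two (2048), which still avoids resizing.
        Map<String, Integer> map = new HashMap<>(capacityFor(expectedEntries));
        for (int i = 0; i < expectedEntries; i++) {
            map.put("key-" + i, i);
        }
        System.out.println(map.size()); // 1000
    }
}
```

On newer JDKs (19+), `HashMap.newHashMap(numMappings)` performs this sizing calculation for you.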

If it were a larger map, in a situation where high-speed lookup was really critical, I would suggest using some sort of trie rather than a hash map. For long strings, in large maps, you can save space, and some time, by using a more string-oriented data structure, such as a trie.
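As an illustration only, a minimal trie for String keys might look like the sketch below (insert/lookup only, no deletion or path compression; a production version would use a more compact child representation):

```java
import java.util.HashMap;
import java.util.Map;

// Minimal illustrative trie mapping String keys to int values - a sketch
// of the "more string-oriented data structure" suggested above, not a
// production implementation.
public class TrieDemo {
    static class Node {
        Map<Character, Node> children = new HashMap<>();
        Integer value; // null means no key terminates at this node
    }

    private final Node root = new Node();

    void put(String key, int value) {
        Node cur = root;
        for (char c : key.toCharArray()) {
            cur = cur.children.computeIfAbsent(c, k -> new Node());
        }
        cur.value = value;
    }

    Integer get(String key) {
        Node cur = root;
        for (char c : key.toCharArray()) {
            cur = cur.children.get(c);
            if (cur == null) {
                return null;
            }
        }
        return cur.value;
    }

    public static void main(String[] args) {
        TrieDemo trie = new TrieDemo();
        // Long keys with shared prefixes are where a trie saves space:
        // each prefix is stored once, not once per key.
        trie.put("config.cache.ttl", 300);
        trie.put("config.cache.size", 1024);
        System.out.println(trie.get("config.cache.ttl")); // 300
        System.out.println(trie.get("config.cache"));     // null (prefix only)
    }
}
```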

