Java 8 HashMap high memory usage
Problem description
I use a HashMap to store a Q-table for an implementation of a reinforcement learning algorithm. The map should hold 15,000,000 entries. When I ran my algorithm, I saw that the process used more than 1,000,000K of memory, whereas by my calculation it should use no more than 530,000K. I wrote a minimal example and saw the same high memory usage:
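import java.util.HashMap;

public class Main {
    public static void main(String[] args) {
        HashMap<Integer, Integer> map = new HashMap<>(16_000_000, 1);
        for (int i = 0; i < 15_000_000; i++) {
            map.put(i, i); // boxes i twice: once as the key, once as the value
        }
    }
}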
My memory calculation:
Each entry is 32 bytes
Capacity is 15000000
HashMap Instance uses: 32 * SIZE + 4 * CAPACITY
memory = (15000000 * 32 + 15000000 * 4) / 1024 = 527343.75K
Where am I going wrong in my memory calculation?
Well, in the best case, we assume a word size of 32 bits/4 bytes (with CompressedOops and CompressedClassPointers). Then, a map entry consists of two words of JVM overhead (klass pointer and mark word), plus key, value, hash code and next pointer, making 6 words in total, in other words 24 bytes. So 15,000,000 entry instances will consume 360 MB.
Additionally, there's the array holding the entries. HashMap uses capacities that are a power of two, so for 15,000,000 entries the array has at least 16,777,216 slots, consuming 64 MiB at 4 bytes per reference.
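The rounding described above can be reproduced with a few lines of arithmetic. This is only a sketch: the class and method names are made up for illustration, it does not call into HashMap internals, and the 4-byte slot size assumes compressed references as in the answer.

```java
class TableSize {
    // Smallest power of two that is >= cap (valid for 1 <= cap <= 2^30).
    static int nextPowerOfTwo(int cap) {
        int highest = Integer.highestOneBit(cap);
        return highest == cap ? highest : highest << 1;
    }

    public static void main(String[] args) {
        int slots = nextPowerOfTwo(16_000_000);
        long bytes = slots * 4L; // assumption: 4 bytes per reference
        System.out.println(slots + " slots, " + bytes / (1024 * 1024) + " MiB");
        // prints "16777216 slots, 64 MiB"
    }
}
```

Requesting 15,000,000 instead of 16,000,000 gives the same result, since both fall between 2^23 and 2^24.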
Then, you have 30,000,000 Integer instances. The problem is that map.put(i, i) performs two boxing operations. While the JVM is encouraged to reuse objects when boxing, it is not required to do so, and reuse won't happen in your simple program, which is likely to complete before the optimizer ever intervenes.
To be precise, the first 128 Integer instances are reused, because sharing is mandatory for values in the -128 … +127 range. However, the implementation does this by initializing the entire cache on first use. So the first 128 iterations don't create two instances each, but the cache consists of 256 instances, which is twice that number, and we again end up with 30,000,000 Integer instances in total. An Integer instance consists of at least the two JVM-specific words and the actual int value, which would make 12 bytes, but due to the default alignment, the memory actually consumed is 16 bytes, a multiple of eight.
So the 30,000,000 created Integer instances consume 480 MB.
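The mandatory sharing of small values can be observed with a reference comparison. The sketch below uses a hypothetical helper, sameInstance; only the result for values in the -128 … +127 range is guaranteed, while the result for larger values depends on the JVM and its settings.

```java
class BoxingIdentity {
    // Boxes the same int twice and checks whether both boxes are the same object.
    static boolean sameInstance(int v) {
        Integer a = v; // autoboxing, equivalent to Integer.valueOf(v)
        Integer b = v;
        return a == b; // reference comparison on purpose, not equals()
    }

    public static void main(String[] args) {
        System.out.println("127 shared: " + sameInstance(127)); // guaranteed true
        System.out.println("128 shared: " + sameInstance(128)); // not guaranteed either way
    }
}
```

When the second line reports false, each put(i, i) outside the cached range has allocated two separate Integer objects, which is the case the 480 MB figure assumes.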
This makes a total of 360 MB + 64 MiB + 480 MB, which is more than 900 MB, making a heap size of 1 GB entirely plausible.
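For reference, the same sum written out as code. The constants are the figures used in this answer (24-byte entries, 4-byte references, 16-byte Integer objects); the class and constant names are only illustrative.

```java
class MemoryEstimate {
    static final long ENTRIES = 15_000_000L;
    static final long ENTRY_BYTES = 24;        // 6 words of 4 bytes each
    static final long TABLE_SLOTS = 16_777_216L; // next power of two
    static final long REF_BYTES = 4;           // one reference per table slot
    static final long INTEGER_BYTES = 16;      // 12 bytes padded to 16

    static long totalBytes() {
        long entries = ENTRIES * ENTRY_BYTES;        // 360,000,000
        long table = TABLE_SLOTS * REF_BYTES;        // 67,108,864
        long integers = 2 * ENTRIES * INTEGER_BYTES; // 480,000,000
        return entries + table + integers;
    }

    public static void main(String[] args) {
        System.out.println(totalBytes() + " bytes"); // prints "907108864 bytes"
    }
}
```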
But that's what profiling tools are for. After running your program, I got:
[Image: used memory sorted by size, https://i.stack.imgur.com/BwRHR.png]
Note that this tool only reports the used size of the objects, i.e. the 12 bytes for an Integer object, without considering the padding that you will notice when looking at the total memory allocated by the JVM.
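If no profiler is at hand, a rough in-process check is possible with Runtime. This is a sketch, not what the answer used: HeapProbe and bytesPerEntry are hypothetical names, System.gc() is only a hint, and the number fluctuates between runs and JVMs, so it should be read as an approximation.

```java
import java.util.HashMap;
import java.util.Map;

class HeapProbe {
    // Used heap after asking the JVM to collect garbage.
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        System.gc(); // a request, not a guarantee
        return rt.totalMemory() - rt.freeMemory();
    }

    // Approximate heap growth per entry when filling a map with n boxed pairs.
    static long bytesPerEntry(int n) {
        long before = usedHeap();
        Map<Integer, Integer> map = new HashMap<>(n, 1);
        for (int i = 0; i < n; i++) {
            map.put(i, i);
        }
        long after = usedHeap();
        if (map.size() != n) { // also keeps the map reachable during measurement
            throw new IllegalStateException("unexpected map size");
        }
        return (after - before) / n;
    }

    public static void main(String[] args) {
        System.out.println("approx. bytes per entry: " + bytesPerEntry(1_000_000));
    }
}
```

Unlike the tool mentioned above, this measures heap growth as a whole, so object padding and the table array are included in the per-entry figure.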