Java 8 hashmap high memory usage

Question

I use a HashMap to store a QTable for an implementation of a reinforcement learning algorithm. My HashMap should store 15,000,000 entries. When I ran my algorithm, I saw that the memory used by the process was over 1,000,000K. By my calculation, it should use no more than 530,000K. I wrote a small example and got the same high memory usage:

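import java.util.HashMap;

public class Main {
    public static void main(String[] args) {
        // Requested capacity 16,000,000 with load factor 1, so no resize happens.
        HashMap<Integer, Integer> map = new HashMap<>(16_000_000, 1);
        for (int i = 0; i < 15_000_000; i++) {
            map.put(i, i); // i is boxed twice: once as key, once as value
        }
    }
}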

My memory calculation:

Each entry is 32 bytes
Capacity is 15,000,000
HashMap instance uses: 32 * SIZE + 4 * CAPACITY bytes = (15,000,000 * 32 + 15,000,000 * 4) / 1024 = 527,343.75K

Where am I wrong in my memory calculations?

Recommended answer

Well, in the best case, we assume a word size of 32 bits/4 bytes (with CompressedOops and CompressedClassPointers). Then a map entry consists of two words of JVM overhead (klass pointer and mark word), the key, the value, the hash code and the next pointer, making 6 words in total, in other words 24 bytes. So 15,000,000 entry instances will consume 360 MB.

Additionally, there’s the array holding the entries. The HashMap uses capacities that are a power of two, so for 15,000,000 entries, the array size is at least 16,777,216, consuming 64 MiB.
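
The rounding described above can be checked with a few lines of Java. This is only a sketch: `TableCapacitySketch`, `roundUpToPowerOfTwo` and `tableBytes` are hypothetical names made up for illustration, not the JDK's internal code, and the 4 bytes per array slot assume compressed references as in the best case above.

```java
// Hypothetical helper (not JDK code): reproduces the power-of-two rounding
// the answer describes for the HashMap's internal table.
final class TableCapacitySketch {

    // Rounds a positive requested capacity (below 2^30) up to the next power of two.
    static int roundUpToPowerOfTwo(int requested) {
        int highest = Integer.highestOneBit(requested);
        return highest == requested ? requested : highest << 1;
    }

    // Size of the reference array alone, assuming bytesPerReference per slot.
    static long tableBytes(int requested, int bytesPerReference) {
        return (long) roundUpToPowerOfTwo(requested) * bytesPerReference;
    }

    public static void main(String[] args) {
        int slots = roundUpToPowerOfTwo(16_000_000);
        long bytes = tableBytes(16_000_000, 4); // assumption: 4-byte references
        System.out.println(slots + " slots, " + bytes / (1024 * 1024) + " MiB");
    }
}
```

Both the 16,000,000 requested in the question and the 15,000,000 entries actually stored round up to the same 16,777,216 slots.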

Then, you have 30,000,000 Integer instances. The problem is that map.put(i, i) performs two boxing operations and while the JVM is encouraged to reuse objects when boxing, it is not required to do so and reusing won’t happen in your simple program that is likely to complete before the optimizer ever interferes.

To be precise, the first 128 Integer instances are reused, because sharing is mandatory for values in the -128 … +127 range. However, the implementation does this by initializing the entire cache on first use, so while the first 128 iterations don’t create two instances each, the cache itself consists of 256 instances, which is twice that number, and we again end up with 30,000,000 Integer instances in total. An Integer instance consists of at least the two JVM-specific words and the actual int value, which would make 12 bytes, but due to the default alignment, the memory actually consumed will be 16 bytes, a multiple of eight.
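
The sharing rule for small values can be observed directly. The following is a minimal sketch; `BoxingCacheSketch` and `boxesToSameObject` are hypothetical names. Only the result for values in the -128 … +127 range is guaranteed. For larger values the result is typically `false` on a default JVM, but that depends on JVM settings and optimizations, so no output is claimed for it here.

```java
// Hypothetical demo class: compares the identity of two boxed copies of one int.
final class BoxingCacheSketch {

    // True when boxing the same int twice yields the very same object.
    static boolean boxesToSameObject(int value) {
        Integer first = value;   // autoboxing, compiled to Integer.valueOf(value)
        Integer second = value;  // second, independent boxing operation
        return first == second;  // reference comparison on purpose
    }

    public static void main(String[] args) {
        // Guaranteed to be shared: value lies within -128 … +127.
        System.out.println("127 shared:  " + boxesToSameObject(127));
        // Not guaranteed either way: outside the mandatory cache range.
        System.out.println("1000 shared: " + boxesToSameObject(1000));
    }
}
```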

So the 30,000,000 created Integer instances consume 480 MB.
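
The step from 12 to 16 bytes is plain round-up-to-alignment arithmetic. Here is a small sketch of it, assuming the layout used in this answer (two 4-byte header words plus a 4-byte int, 8-byte alignment); `IntegerFootprintSketch` and `alignUp` are hypothetical names.

```java
// Hypothetical helper: rounds an object size up to the JVM's object alignment.
final class IntegerFootprintSketch {

    // Smallest multiple of alignment that is >= size.
    static long alignUp(long size, long alignment) {
        return (size + alignment - 1) / alignment * alignment;
    }

    public static void main(String[] args) {
        long rawSize = 4 + 4 + 4;            // assumption: two header words + int value
        long perInstance = alignUp(rawSize, 8);
        long total = 30_000_000L * perInstance;
        System.out.println(perInstance + " bytes per Integer, " + total + " bytes in total");
    }
}
```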

This makes a total of 360 MB + 64 MiB + 480 MB, which is more than 900 MB, making a heap size of 1 GB entirely plausible.
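
Putting the three terms together gives a rough estimator. This is only a back-of-the-envelope sketch using the per-object sizes assumed in this answer (24-byte entries, 4-byte array slots, 16-byte Integer instances); real sizes depend on the JVM and its flags. `HeapEstimateSketch` and `estimateBytes` are hypothetical names.

```java
// Hypothetical estimator: sums the three contributions discussed in the answer.
final class HeapEstimateSketch {

    static final long ENTRY_BYTES = 24;    // assumption: 6 words of 4 bytes per map entry
    static final long SLOT_BYTES = 4;      // assumption: 4-byte reference per table slot
    static final long INTEGER_BYTES = 16;  // assumption: 12 bytes padded to 16

    static long estimateBytes(long entries, long tableSlots) {
        long entryObjects = entries * ENTRY_BYTES;
        long table = tableSlots * SLOT_BYTES;
        long boxedIntegers = 2 * entries * INTEGER_BYTES; // one key and one value each
        return entryObjects + table + boxedIntegers;
    }

    public static void main(String[] args) {
        long bytes = estimateBytes(15_000_000L, 16_777_216L);
        System.out.println(bytes + " bytes, about " + bytes / 1024 + "K");
    }
}
```

The result is far above the 527,343.75K expected in the question, mainly because that calculation leaves out the boxed Integer objects.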

But that’s what profiling tools are for. After running your program, I got

Note that this tool only reports the used size of the objects, i.e. the 12 bytes for an Integer object without considering the padding that you will notice when looking at the total memory allocated by the JVM.
