How to calculate HashMap memory usage in Java?

Question

I was asked in an interview to calculate the memory usage of a HashMap, and to estimate how much memory it would consume with 2 million items in it.

For example:

Map<String, List<String>> mp = new HashMap<String, List<String>>();

The map looks like this:

key   value
----- ---------------------------
abc   ['hello','how']
abz   ['hello','how','are','you']

How would I estimate the memory usage of this HashMap object in Java?

Recommended answer

The short answer

To find out how large an object is, I would use a profiler. In YourKit, for example, you can search for the object and then have the profiler calculate its deep size. This gives you a fair idea of how much memory would be used if the object were stand-alone, and it is a conservative size for the object.

A quibble

If parts of the object are reused in other structures, e.g. String literals, you won't free this much memory by discarding it. In fact, discarding one reference to the HashMap might not free any memory at all.

What about serialisation?

Serialising the object is one approach to getting an estimate, but it can be wildly off because the overhead and encoding of a serialised byte stream differ from those of the in-memory layout. How much memory is used depends on the JVM (and on whether it is using 32-bit or 64-bit references), but the serialisation format is always the same.

For example

In Sun/Oracle's JVM, an Integer can take 16 bytes for the header, 4 bytes for the number and 4 bytes of padding (objects are 8-byte aligned in memory), 24 bytes in total. However, if you serialise one Integer it takes 81 bytes, and if you serialise two Integers they take 91 bytes. That is, the size of the first Integer is inflated, while the second Integer takes less than it uses in memory.
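The gap between serialised size and in-memory size can be checked directly. The following is a minimal sketch using standard Java serialisation; the class and method names (SerialisedSize, serialisedSize) are made up for illustration, and the exact byte counts may vary with the JVM version.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

class SerialisedSize {
    /** Serialises the given objects to a single stream and returns the total byte count. */
    public static int serialisedSize(Object... objects) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            for (Object o : objects) {
                out.writeObject(o);
            }
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrow unchecked to keep the API simple.
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        int one = serialisedSize(Integer.valueOf(1));
        int two = serialisedSize(Integer.valueOf(1), Integer.valueOf(2));
        System.out.println("one Integer:  " + one + " bytes");
        System.out.println("two Integers: " + two + " bytes");
        // The first object pays for the stream header and class description;
        // later objects of the same class only pay for a reference to it plus their data.
        System.out.println("cost of the second Integer: " + (two - one) + " bytes");
    }
}
```

The stream header and the class description are written once, which is why the first object looks far larger than it is in memory and the second looks smaller.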

String is a much more complex example. In the Sun/Oracle JVM, it contains 3 int values and a char[] reference. So you might assume it uses a 16-byte header, plus 3 * 4 bytes for the ints, 4 bytes for the char[] reference, 16 bytes for the overhead of the char[] and then two bytes per char, aligned to an 8-byte boundary...
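The arithmetic above can be written out as a small calculation. This is only a sketch of the layout as the answer describes it (16-byte header, three ints, one char[] reference, two bytes per char); the class and method names are hypothetical, and the reference size is a parameter because it depends on the JVM.

```java
class StringSizeEstimate {
    /** Rounds a size up to the next multiple of 8, the object alignment described in the text. */
    public static long align8(long bytes) {
        return (bytes + 7) / 8 * 8;
    }

    /**
     * Estimated bytes for a String laid out as the text describes:
     * a 16-byte header, three ints and one char[] reference, plus the char[] itself
     * (16 bytes of overhead and two bytes per char).
     */
    public static long estimate(int length, int referenceBytes) {
        long stringObject = align8(16 + 3 * 4 + referenceBytes);
        long charArray = align8(16 + 2L * length);
        return stringObject + charArray;
    }

    public static void main(String[] args) {
        System.out.println("4 chars, 4-byte reference: " + estimate(4, 4) + " bytes");
        System.out.println("4 chars, 8-byte reference: " + estimate(4, 8) + " bytes");
    }
}
```

With a 4-byte reference the String object itself is exactly 32 bytes; with an 8-byte reference it is 36 bytes, which is padded up to 40.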

Which flags can change the size?

If you have 64-bit references, the char[] reference is 8 bytes long, resulting in 4 bytes of padding. If you have a 64-bit JVM, you can use -XX:+UseCompressedOops to use 32-bit references. (So looking at the JVM's bit size alone doesn't tell you the size of its references.)
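One way to see which reference size a running JVM is using is to ask it for the flag's value. The sketch below assumes a HotSpot-based JVM, since it relies on com.sun.management.HotSpotDiagnosticMXBean and the sun.arch.data.model system property, neither of which is guaranteed elsewhere; the class name and the "unknown" fallback are illustrative choices.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

class ReferenceSizeCheck {
    /** Returns the value of a HotSpot VM option, or "unknown" if this JVM does not expose it. */
    public static String vmOption(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption(name).getValue();
        } catch (RuntimeException e) {
            // Covers a missing bean as well as an option name this JVM does not know.
            return "unknown";
        }
    }

    public static void main(String[] args) {
        // Bit size of the JVM; on its own this does not determine the reference size.
        System.out.println("sun.arch.data.model = " + System.getProperty("sun.arch.data.model"));
        // "true" means references are stored in 32 bits even on a 64-bit JVM.
        System.out.println("UseCompressedOops   = " + vmOption("UseCompressedOops"));
    }
}
```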

If you have -XX:+UseCompressedStrings, the JVM will use a byte[] instead of a char[] when it can. This can slow down your application slightly but could improve your memory consumption dramatically. When a byte[] is used, the memory consumed is 1 byte per char. ;) Note: for a 4-char String, as in the example, the size used is the same due to the 8-byte boundary.
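The note about the 4-char case follows from the alignment rule, and a few lines of arithmetic make it visible. This sketch reuses the 16-byte array overhead quoted in the answer; the class and method names are hypothetical.

```java
class ArrayFootprint {
    /** Rounds a size up to the next multiple of 8. */
    public static long align8(long bytes) {
        return (bytes + 7) / 8 * 8;
    }

    /** Estimated array size: 16 bytes of overhead plus the element data, 8-byte aligned. */
    public static long arrayBytes(int length, int bytesPerElement) {
        return align8(16 + (long) length * bytesPerElement);
    }

    public static void main(String[] args) {
        // Compare two bytes per char (char[]) with one byte per char (byte[]).
        for (int length : new int[] {4, 16, 100, 1000}) {
            System.out.println(length + " chars: char[] " + arrayBytes(length, 2)
                    + " bytes, byte[] " + arrayBytes(length, 1) + " bytes");
        }
    }
}
```

For very short strings the padding absorbs the saving; the difference only approaches a factor of two as the strings get longer.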

What do you mean by "size"?

As has been pointed out, HashMap and List are more complex because many, if not all, of the Strings can be reused, possibly as String literals. What you mean by "size" depends on how it is used, i.e. how much memory would the structure use alone? How much would be freed if the structure were discarded? How much memory would be used if you copied the structure? These questions can have different answers.

What can you do without a profiler?

If you can determine that the likely conservative size is small enough, the exact size doesn't matter. The conservative case is likely to be the one where you construct every String and entry from scratch. (I only say "likely" because a HashMap can have capacity for 1 billion entries even though it is empty, and a String with a single char can be a substring of a String with 2 billion characters.)
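For the 2 million entry question, a conservative back-of-the-envelope figure could be sketched as below. Every number here is an assumption rather than a measurement: the 16-byte header and String layout come from the answer, references are taken as 4 bytes, the map entry is assumed to hold a hash plus key, value and next references, the list is assumed to be an ArrayList whose backing array is exactly full, and no Strings are shared. The class and method names are hypothetical.

```java
class ConservativeMapEstimate {
    static final int HEADER = 16; // object/array overhead used in the text (assumption)
    static final int REF = 4;     // assumes 32-bit (compressed) references

    public static long align8(long bytes) {
        return (bytes + 7) / 8 * 8;
    }

    /** String as described in the text: header, three ints, a char[] reference, plus the char[]. */
    public static long stringBytes(int length) {
        return align8(HEADER + 3 * 4 + REF) + align8(HEADER + 2L * length);
    }

    /** ArrayList-like list: two ints and an array reference, a full reference array, and its Strings. */
    public static long listBytes(int elements, int elementLength) {
        long list = align8(HEADER + 2 * 4 + REF);
        long array = align8(HEADER + (long) REF * elements);
        return list + array + elements * stringBytes(elementLength);
    }

    /** Whole map, with every key, list and element String constructed from scratch. */
    public static long mapBytes(long entries, int keyLength, int listSize, int elementLength) {
        long entry = align8(HEADER + 4 + 3 * REF); // hash + key, value and next references
        long buckets = 1;
        while (buckets * 3 / 4 < entries) {
            buckets *= 2; // power-of-two table with a 0.75 load factor
        }
        long table = align8(HEADER + REF * buckets);
        return table + entries * (entry + stringBytes(keyLength) + listBytes(listSize, elementLength));
    }

    public static void main(String[] args) {
        // 3-char keys and four values of up to 5 chars each, as in the question's example.
        long bytes = mapBytes(2_000_000L, 3, 4, 5);
        System.out.println(bytes + " bytes, about " + bytes / (1024 * 1024) + " MiB");
    }
}
```

Under these assumptions the figure comes to roughly 0.83 GB. If the values are mostly shared Strings such as literals, the real footprint would be considerably smaller, which is the sense in which this is an upper bound.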

You can perform a System.gc(), take the free memory, create the objects, perform another System.gc() and see how much the free memory has reduced. You may need to create the object many times and take an average. You will have to repeat this exercise many times, but it can give you a fair idea.

(BTW, while System.gc() is only a hint, the Sun/Oracle JVM will perform a full GC every time by default.)
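The measurement described above could look something like the following sketch. It compares used heap (total minus free) rather than free memory alone, since the heap may grow between the two readings; the class name, method names and the entry count are illustrative, and the result is only approximate because System.gc() is a hint.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class HeapDeltaMeasurement {
    /** Requests a GC (only a hint) and returns the heap currently in use, in bytes. */
    public static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        System.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Builds a map shaped like the one in the question, with every String created fresh. */
    public static Map<String, List<String>> buildMap(int entries) {
        Map<String, List<String>> map = new HashMap<>();
        for (int i = 0; i < entries; i++) {
            List<String> values = new ArrayList<>();
            for (String word : new String[] {"hello", "how", "are", "you"}) {
                // Copy the characters so nothing is shared with the String literal.
                values.add(new String(word.toCharArray()));
            }
            map.put("key" + i, values);
        }
        return map;
    }

    public static void main(String[] args) {
        int entries = 200_000; // scale up to 2 million if the heap is large enough
        long before = usedHeap();
        Map<String, List<String>> map = buildMap(entries);
        long after = usedHeap();
        System.out.println("approx. bytes per entry: " + (after - before) / entries);
        // Use the map after measuring so it is still reachable at the second reading.
        System.out.println("entries: " + map.size());
    }
}
```

Running it several times and averaging, as suggested above, helps smooth out noise from other allocations.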
