How to calculate HashMap memory usage in Java?


Question

In an interview, I was asked to calculate the memory usage of a HashMap and to estimate how much memory it would consume with 2 million items in it.

For example:

Map<String, List<String>> mp = new HashMap<String, List<String>>();

The mapping looks like this: each key is a String and each value is a list of Strings.

key   value
----- ---------------------------
abc   ['hello','how']
abz   ['hello','how','are','you']

How would I estimate the memory usage of this HashMap Object in Java?

Solution

The short answer

To find out how large an object is, I would use a profiler. In YourKit, for example, you can search for the object and then have it calculate the object's deep size. This gives you a fair idea of how much memory would be used if the object were standalone, and it is a conservative size for the object.

The quibbles

If parts of the object are reused in other structures, e.g. String literals, you won't free this much memory by discarding it. In fact, discarding one reference to the HashMap might not free any memory at all.

What about Serialisation?

Serialising the object is one approach to getting an estimate, but it can be wildly off because the overhead and encoding of the serialised byte stream differ from those of the in-memory layout. How much memory is used depends on the JVM (and whether it's using 32-bit or 64-bit references), but the serialisation format is always the same.

e.g.

In Sun/Oracle's JVM, an Integer can take 16 bytes for the header, 4 bytes for the number and 4 bytes of padding (objects are 8-byte aligned in memory), for a total of 24 bytes. However, if you serialise one Integer, it takes 81 bytes; serialise two Integers and they take 91 bytes. That is, the serialised size of the first Integer is inflated, while the second Integer takes less than it uses in memory.
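The serialisation figures above can be checked with a small sketch like the following. The class and method names are made up for illustration, and the exact byte counts you see may depend on the JDK in use.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper class, not part of the original answer.
class SerializedSizeDemo {

    // Writes all given objects into ONE stream and returns the total byte count.
    static int serializedSize(Object... objects) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            for (Object o : objects) {
                out.writeObject(o);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed (and therefore flushed) before the size is read.
        return bytes.size();
    }

    public static void main(String[] args) {
        // Two different values, so the second write is a new object and not
        // just a back-reference to the first one.
        int one = serializedSize(Integer.valueOf(1));
        int two = serializedSize(Integer.valueOf(1), Integer.valueOf(2));
        System.out.println("one Integer:        " + one + " bytes");
        System.out.println("two Integers:       " + two + " bytes");
        System.out.println("cost of the second: " + (two - one) + " bytes");
    }
}
```

The first object pays for the stream header and the class descriptors of Integer and Number; the second reuses those descriptors, which is why its marginal cost is so much lower.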

String is a much more complex example. In the Sun/Oracle JVM, it contains 3 int values and a char[] reference. So you might assume it uses a 16-byte header plus 3 * 4 bytes for the ints and 4 bytes for the char[] reference, then 16 bytes for the overhead of the char[] itself and two bytes per char, aligned to an 8-byte boundary...
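As a rough worked example of that arithmetic, here is a minimal sketch. It assumes the String layout described above (three ints plus a char[] reference), a 16-byte array overhead and 8-byte alignment; the header and reference sizes are parameters because they vary between JVMs. All names are hypothetical.

```java
// Hypothetical estimator; the layout figures are assumptions taken from the text.
class StringSizeEstimate {
    static final int ALIGNMENT = 8;       // objects are assumed to be 8-byte aligned
    static final int ARRAY_OVERHEAD = 16; // assumed overhead of a char[]

    // Rounds a size up to the next multiple of the alignment.
    static long align(long bytes) {
        return (bytes + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    // Estimated bytes for a String of the given length that owns its own char[].
    static long estimateStringBytes(int length, int headerBytes, int refBytes) {
        long stringObject = align(headerBytes + 3 * 4 + refBytes); // header + 3 ints + char[] reference
        long charArray = align(ARRAY_OVERHEAD + 2L * length);      // array overhead + 2 bytes per char
        return stringObject + charArray;
    }

    public static void main(String[] args) {
        // "abc" with a 16-byte header and 4-byte references: 32 + 24 bytes.
        System.out.println(estimateStringBytes(3, 16, 4));
        // Same String with 8-byte references: the String object grows to 40 bytes.
        System.out.println(estimateStringBytes(3, 16, 8));
    }
}
```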

What flags can change the size?

If you have 64-bit references, the char[] reference is 8 bytes long, resulting in 4 bytes of padding. If you have a 64-bit JVM, you can use -XX:+UseCompressedOops to use 32-bit references. (So looking at the JVM bit size alone doesn't tell you the size of its references.)

If you have -XX:+UseCompressedStrings, the JVM will use a byte[] instead of a char array when it can. This can slow down your application slightly but could improve your memory consumption dramatically. When a byte[] is used, the memory consumed is 1 byte per char. ;) Note: for a 4-char String, as in the example, the size used is the same due to the 8-byte boundary.
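The note about the 8-byte boundary can be made concrete with a little arithmetic. This is only a sketch, assuming the 16-byte array overhead and 8-byte alignment used earlier; the class name is invented.

```java
// Hypothetical comparison of a char[] and a byte[] holding the same characters.
class ArrayFootprint {
    static final int ARRAY_OVERHEAD = 16; // assumed array overhead

    // Array size rounded up to the next 8-byte boundary.
    static long alignedArrayBytes(int length, int bytesPerElement) {
        long raw = ARRAY_OVERHEAD + (long) length * bytesPerElement;
        return (raw + 7) / 8 * 8;
    }

    public static void main(String[] args) {
        for (int length : new int[] {4, 100}) {
            System.out.println(length + " chars: char[] = " + alignedArrayBytes(length, 2)
                    + " bytes, byte[] = " + alignedArrayBytes(length, 1) + " bytes");
        }
    }
}
```

For 4 characters both arrays round up to the same size, so the saving only shows up for longer Strings.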

What do you mean by "size"?

As has been pointed out, HashMap and List are more complex, as many, if not all, of the Strings can be reused, possibly as String literals. What you mean by "size" depends on how it is used. i.e. How much memory would the structure use alone? How much would be freed if the structure were discarded? How much memory would be used if you copied the structure? These questions can have different answers.
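To see how much reuse there is in a particular map, one option is to count String instances by identity rather than by equals(). The sketch below does this for the example map from the question; the helper names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical demo: shared String literals versus freshly copied Strings.
class SharedStringsDemo {

    // Returns the literal itself, or a fresh copy that is a separate object.
    static String s(String literal, boolean copy) {
        return copy ? new String(literal) : literal;
    }

    // Builds the example map from the question.
    static Map<String, List<String>> exampleMap(boolean copy) {
        Map<String, List<String>> mp = new HashMap<String, List<String>>();
        mp.put(s("abc", copy), Arrays.asList(s("hello", copy), s("how", copy)));
        mp.put(s("abz", copy),
                Arrays.asList(s("hello", copy), s("how", copy), s("are", copy), s("you", copy)));
        return mp;
    }

    // Counts distinct String objects (compared with ==) reachable from the map.
    static int distinctStringInstances(Map<String, List<String>> map) {
        Set<String> seen = Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
        for (Map.Entry<String, List<String>> e : map.entrySet()) {
            seen.add(e.getKey());
            seen.addAll(e.getValue());
        }
        return seen.size();
    }

    // Counts every String reference held by the map (keys plus list elements).
    static int totalStringReferences(Map<String, List<String>> map) {
        int total = 0;
        for (List<String> value : map.values()) {
            total += 1 + value.size();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("references:            " + totalStringReferences(exampleMap(false)));
        System.out.println("instances (literals):  " + distinctStringInstances(exampleMap(false)));
        System.out.println("instances (copies):    " + distinctStringInstances(exampleMap(true)));
    }
}
```

When the map is built from literals, "hello" and "how" are each one shared object, so the memory the structure adds is smaller than a per-reference count would suggest.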

What can you do without a profiler?

If you can determine that the likely conservative size is small enough, the exact size doesn't matter. The conservative case is likely to be the one where you construct every String and entry from scratch. (I only say likely because a HashMap can have capacity for 1 billion entries even though it is empty, and a String with a single char can be a substring of a String with 2 billion characters.)
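A back-of-envelope version of that conservative case might look like the sketch below. Everything in it is an assumption for illustration: 16-byte object headers, 4-byte references, 16-byte array overhead and 8-byte alignment (the figures used earlier), a String with three ints and a char[] reference, a map entry with a hash and three references, an ArrayList whose backing array is exactly full, and the default load factor of 0.75. The HashMap object itself is ignored, and the key and value lengths are hypothetical inputs.

```java
// Hypothetical conservative estimate: every String and entry is built from scratch.
class ConservativeMapEstimate {
    static final int HEADER = 16;       // assumed object header size
    static final int ARRAY_HEADER = 16; // assumed array overhead
    static final int REF = 4;           // assumed reference size

    static long align(long bytes) {
        return (bytes + 7) / 8 * 8;
    }

    // String object (header, 3 ints, char[] reference) plus its own char[].
    static long stringBytes(int chars) {
        return align(HEADER + 3 * 4 + REF) + align(ARRAY_HEADER + 2L * chars);
    }

    // ArrayList (header, size, modCount, array reference), backing array, and elements.
    static long listBytes(int elements, int charsPerElement) {
        long list = align(HEADER + 2 * 4 + REF);
        long backing = align(ARRAY_HEADER + (long) REF * elements); // assumes no spare capacity
        return list + backing + elements * stringBytes(charsPerElement);
    }

    // Map entry: header, int hash, and key/value/next references.
    static long entryBytes() {
        return align(HEADER + 4 + 3 * REF);
    }

    // Bucket array: smallest power-of-two capacity whose 0.75 threshold fits the entries.
    static long tableBytes(long entries) {
        long capacity = 16;
        while (capacity * 0.75 < entries) {
            capacity *= 2;
        }
        return align(ARRAY_HEADER + REF * capacity);
    }

    static long mapBytes(long entries, int keyChars, int listSize, int valueChars) {
        long perEntry = entryBytes() + stringBytes(keyChars) + listBytes(listSize, valueChars);
        return entries * perEntry + tableBytes(entries);
    }

    public static void main(String[] args) {
        // Hypothetical shape: 3-char keys, each mapped to a list of three 5-char Strings.
        long bytes = mapBytes(2_000_000L, 3, 3, 5);
        System.out.println(bytes + " bytes, about " + bytes / (1024 * 1024) + " MiB");
    }
}
```

With these made-up inputs the sketch comes out at roughly 705 million bytes for 2 million entries. The point is the order of magnitude, which you can then compare against the heap you have available; a profiler will give a more trustworthy figure.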

You can perform a System.gc(), take the free memory, create the objects, perform another System.gc() and see how much the free memory has reduced. You may need to create the object many times and take an average. You may also need to repeat this exercise many times, but it can give you a fair idea.
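A minimal sketch of this measurement is shown below. It compares used memory (total minus free) rather than free memory alone, since the heap may grow between the two readings; that is a small deviation from the description above. The class, the method names and the generated keys and values are all made up for illustration, and the result will vary from run to run.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical measurement harness; numbers are approximate and JVM dependent.
class FreeMemoryEstimate {

    // Requests a GC, then reports how much of the heap is currently in use.
    static long usedMemory() {
        Runtime rt = Runtime.getRuntime();
        System.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Builds a map in which every key and value String is a fresh object.
    static Map<String, List<String>> build(int entries) {
        Map<String, List<String>> map = new HashMap<String, List<String>>();
        for (int i = 0; i < entries; i++) {
            List<String> value = new ArrayList<String>();
            value.add("hello" + i);
            value.add("how" + i);
            map.put("key" + i, value);
        }
        return map;
    }

    // Returns the growth in used memory caused by building the map.
    static long measure(int entries) {
        long before = usedMemory();
        Map<String, List<String>> map = build(entries);
        long after = usedMemory();
        // Touch the map after the second reading so it cannot be collected early.
        if (map.size() != entries) {
            throw new IllegalStateException("unexpected map size");
        }
        return after - before;
    }

    public static void main(String[] args) {
        int entries = 200_000;
        // Repeat a few times; individual runs can differ noticeably.
        for (int run = 0; run < 5; run++) {
            long bytes = measure(entries);
            System.out.println("run " + run + ": " + bytes + " bytes total, "
                    + bytes / entries + " bytes per entry");
        }
    }
}
```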

(BTW, while System.gc() is only a hint, the Sun/Oracle JVM will perform a full GC every time by default.)
