Multimap Space Issue: Guava
Problem description
In my Java code, I am using Guava's Multimap (com.google.common.collect.Multimap) like this:
Multimap<Integer, Integer> Index = HashMultimap.create();
Here, the Multimap key is one portion of a URL and the value is another portion of the URL (converted into an integer). I give my JVM 2560 MB (2.5 GB) of heap space (using -Xmx and -Xms). However, it can only store about 9 to 10 million such (key, value) pairs of integers. In theory, going by the memory an int occupies, it should be able to store far more.
Can anybody help me with the following?
1. Why is Multimap using so much memory? I checked my code, and without inserting pairs into the Multimap it uses only about 1/2 MB of memory.
2. Is there another way, or a home-baked solution, to solve this memory issue? That is, is there any way to reduce the object overhead, given that I only want to store int-int pairs? In another language, perhaps? Or any other solution (home-baked preferred), for example something database-based?
Solution
There's a huge amount of overhead associated with Multimap. At a minimum:
- Each key and value is an Integer object, which (at a minimum) doubles the storage requirements of each int value.
- Each unique key value in the HashMultimap is associated with a Collection of values (according to the source, the Collection is a HashSet).
- Each HashSet is created with default space for 8 values.
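As a rough illustration of how the overheads listed above can add up, here is a back-of-the-envelope sketch. The class name and every size constant are assumptions (figures commonly quoted for a 64-bit HotSpot JVM with compressed references), not measurements, and the model treats the structure as a hash map from Integer keys to hash sets of Integer values, ignoring the outer table and any Integer caching.

```java
// Hypothetical estimator; all sizes below are assumed, not measured.
final class PairOverheadEstimate {
    static final int INTEGER_OBJECT = 16; // assumed: boxed Integer
    static final int HASH_ENTRY = 32;     // assumed: one hash table entry object
    static final int HASH_SET = 16;       // assumed: HashSet wrapper object
    static final int HASH_MAP = 48;       // assumed: HashMap backing each HashSet
    static final int ARRAY_HEADER = 16;   // assumed: header of the bucket array
    static final int REFERENCE = 4;       // assumed: compressed object reference

    // Bucket count of a per-key set: starts at 16, doubles past a 0.75 load factor.
    static int tableSlots(int valuesPerKey) {
        int slots = 16;
        while (valuesPerKey > slots * 3 / 4) {
            slots *= 2;
        }
        return slots;
    }

    // Estimated heap bytes per (key, value) pair when every key holds valuesPerKey values.
    static double bytesPerPair(int valuesPerKey) {
        long perKey = INTEGER_OBJECT + HASH_ENTRY        // boxed key and its entry in the outer map
                + HASH_SET + HASH_MAP                    // the per-key set and its backing map
                + ARRAY_HEADER + (long) REFERENCE * tableSlots(valuesPerKey);
        long perValue = INTEGER_OBJECT + HASH_ENTRY;     // boxed value and its entry in the set
        return (double) perKey / valuesPerKey + perValue;
    }

    public static void main(String[] args) {
        for (int v : new int[] {1, 8, 64}) {
            System.out.printf("%d value(s) per key: ~%.0f bytes per pair%n", v, bytesPerPair(v));
        }
    }
}
```

Under these assumed sizes, the worst case of one value per key works out to 240 bytes per pair, compared with 8 bytes for two raw int values. The real figure depends on the JVM, its settings, and the Guava version, so treat this only as an order-of-magnitude guide.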
So each key/value pair requires (at a minimum) perhaps an order of magnitude more space than you might expect for two int values. (Somewhat less when multiple values are stored under a single key.) I would expect 10 million key/value pairs to take perhaps 400 MB.
Although you have 2.5 GB of heap space, I wouldn't be all that surprised if that's not enough. The above estimate is, I think, on the low side. Plus, it only accounts for how much is needed to store the map once it is built. As the map grows, the table needs to be reallocated and rehashed, which temporarily at least doubles the amount of space used. Finally, all this assumes that int values and object references require 4 bytes. If the JVM is using 64-bit addressing, the byte count probably doubles.
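For the home-baked route the question asks about, one possible sketch is to avoid objects and references altogether by packing each (key, value) pair into a single long inside a primitive array. The class IntPairStore and its methods are hypothetical names invented for this illustration, not part of Guava. Unlike HashMultimap, this sketch keeps duplicate pairs and re-sorts lazily on lookup, so it suits a build-then-query workload.

```java
import java.util.Arrays;

// Hypothetical int-to-int multimap backed by a sorted long[]; 8 bytes per pair plus growth slack.
final class IntPairStore {
    private long[] packed = new long[16];
    private int size = 0;
    private boolean sorted = true;

    // Stores the key in the high 32 bits and the value in the low 32 bits.
    void add(int key, int value) {
        if (size == packed.length) {
            packed = Arrays.copyOf(packed, size * 2); // doubling temporarily needs both arrays
        }
        packed[size++] = ((long) key << 32) | (value & 0xFFFFFFFFL);
        sorted = false;
    }

    // Returns all values stored under key (empty array if none).
    int[] get(int key) {
        if (!sorted) {
            Arrays.sort(packed, 0, size); // groups equal keys into one contiguous run
            sorted = true;
        }
        int start = lowerBound((long) key << 32);
        int end = start;
        while (end < size && (int) (packed[end] >> 32) == key) {
            end++;
        }
        int[] values = new int[end - start];
        for (int i = start; i < end; i++) {
            values[i - start] = (int) packed[i]; // low 32 bits are the value
        }
        return values;
    }

    // Index of the first element >= target in the sorted prefix.
    private int lowerBound(long target) {
        int lo = 0;
        int hi = size;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (packed[mid] < target) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }
}
```

With this layout, 10 million pairs occupy on the order of 80 MB of array data, although growing the array by doubling still briefly requires both the old and the new copy, the same kind of temporary spike described above for rehashing.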