Issue with Hash Map Space

Problem Description

In my Java code, I am using Guava's Multimap (com.google.common.collect.Multimap) like this:

 Multimap<Integer, Integer> Index = HashMultimap.create();

Here, the Multimap key is one portion of a URL and the value is another portion of the URL (converted into an integer). I give my JVM 2560 MB (2.5 GB) of heap space (using -Xmx and -Xms). However, it can only store 9 million such (key, value) pairs of integers (approx. 10 million). The issue is that I can give the JVM only a limited amount of memory (say 2 GB).

So, can anybody help me with the following?

1) Is there another way, or a home-baked solution, to solve this memory issue? That is, could a disk/DB-based multimap be a good solution? I read in some web articles that there are DB/disk-based solutions for this, e.g. Berkeley DB or Ehcache. Can anybody tell me whether they are fast enough (or which one is faster)?

2) Do those disk/DB-based multimaps have performance issues (I am asking about both storing and searching)?

3) Any idea or information on how to use them, in brief.

4) Any other idea would be welcome.

NB: I want Multimap solutions (a key can have multiple values) for the above issue, and I also have to consider the performance of storing and searching.

Solution

You certainly won't store 100 million pairs of Integer objects in 2.5 GB of memory. If I'm not mistaken, an Integer uses at least 16 bytes of memory on the Oracle/Sun JVM (object header plus the 4-byte value, padded to the JVM's object alignment). 100 million pairs means 200 million Integers, which is 3.2 GB of memory for the Integers alone, without any structure around them.
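The estimate above can be reproduced with a few lines of arithmetic. This is only a rough sketch: the class name `BoxedPairEstimate` is made up for illustration, the 16 bytes per Integer is the lower bound quoted in the answer rather than a measured value, and it ignores both the small-value cache used by `Integer.valueOf` and all of the multimap's own overhead.

```java
/** Back-of-the-envelope heap estimate for (key, value) pairs of boxed Integers. */
final class BoxedPairEstimate {
    // Assumption: 16 bytes per java.lang.Integer, the lower bound quoted in the answer.
    static final long BYTES_PER_INTEGER = 16L;

    /** Bytes needed for the Integer objects alone: two Integers per pair. */
    static long integerBytes(long pairs) {
        return pairs * 2L * BYTES_PER_INTEGER;
    }

    /** Bytes needed if the same pairs were held as raw 4-byte ints. */
    static long primitiveBytes(long pairs) {
        return pairs * 2L * Integer.BYTES;
    }

    public static void main(String[] args) {
        long pairs = 100_000_000L;
        System.out.println("boxed bytes:     " + integerBytes(pairs));
        System.out.println("primitive bytes: " + primitiveBytes(pairs));
    }
}
```

For 100 million pairs this gives 3,200,000,000 bytes for the boxed objects against 800,000,000 bytes for raw ints, which is why the answer singles out wrappers as the first thing to avoid.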

With this data size you should definitely go with something that is backed by the disk, or use a server with lots of memory and/or optimized data structures (in particular, try to avoid primitive type wrappers). I have used H2 for similar tasks and found it quite good (it can use mapped files to access the disk instead of explicit reads), but I don't have any comparison with other similar libraries.
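As one possible reading of "avoid primitive type wrappers", here is a minimal in-memory sketch that packs each (key, value) pair into a single `long` and keeps the pairs in a sorted `long[]`, at 8 bytes per pair. `PackedIntMultimap` is a hypothetical class written for this illustration, not part of Guava or H2. It assumes a bulk-load-then-query workload (the array is re-sorted on the first lookup after any insert), and unlike `HashMultimap` it keeps duplicate pairs.

```java
import java.util.Arrays;

/** Sketch of an int-to-int multimap stored as packed longs in one sorted array. */
final class PackedIntMultimap {
    private long[] pairs = new long[16];
    private int size = 0;
    private boolean sorted = true;

    /** Key goes into the high 32 bits, value into the low 32 bits. */
    private static long pack(int key, int value) {
        return ((long) key << 32) | (value & 0xFFFFFFFFL);
    }

    /** Appends a pair; the array is only sorted again on the next lookup. */
    void put(int key, int value) {
        if (size == pairs.length) {
            pairs = Arrays.copyOf(pairs, size * 2);
        }
        pairs[size++] = pack(key, value);
        sorted = false;
    }

    /** Returns every value stored under key, or an empty array if there is none. */
    int[] get(int key) {
        if (!sorted) {
            Arrays.sort(pairs, 0, size);
            sorted = true;
        }
        // All pairs with the same key are contiguous once sorted.
        int from = lowerBound((long) key << 32);
        int to = from;
        while (to < size && (int) (pairs[to] >> 32) == key) {
            to++;
        }
        int[] values = new int[to - from];
        for (int i = from; i < to; i++) {
            values[i - from] = (int) pairs[i]; // low 32 bits are the value
        }
        return values;
    }

    int size() {
        return size;
    }

    /** Index of the first element >= target within the sorted prefix [0, size). */
    private int lowerBound(long target) {
        int lo = 0;
        int hi = size;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (pairs[mid] < target) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }
}
```

At 8 bytes per pair, 10 million pairs take roughly 80 MB plus whatever slack the doubling growth leaves, so this layout fits comfortably in a 2 GB heap. If lookups and inserts are heavily interleaved, or the data outgrows memory, a disk-backed store such as the H2 setup mentioned above is the more appropriate direction.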
