HashMap alternatives for memory-efficient data storage

Question

I've currently got a spreadsheet type program that keeps its data in an ArrayList of HashMaps. You'll no doubt be shocked when I tell you that this hasn't proven ideal. The overhead seems to use 5x more memory than the data itself.

This question asks about efficient collections libraries, and the answer was to use Google Collections. My follow-up is "which part?". I've been reading through the documentation, but it doesn't give me a good sense of which classes are a good fit for this. (I'm also open to other libraries or suggestions.)

So I'm looking for something that will let me store dense spreadsheet-type data with minimal memory overhead.


  • My columns are currently referenced by Field objects, rows by their indexes, and values are Objects, almost always Strings
  • Some columns will have a lot of repeated values
  • Primary operations are updating or removing records based on the values of certain fields, and adding/removing/combining columns

I'm aware of options like H2 and Derby but in this case I'm not looking to use an embedded database.

EDIT: If you're suggesting libraries, I'd also appreciate it if you could point me to a particular class or two in them that would apply here. Whereas Sun's documentation usually includes information about which operations are O(1), which are O(N), etc, I'm not seeing much of that in third-party libraries, nor really any description of which classes are best suited for what.

Recommended answer

So I'm assuming that you have something like a Map<ColumnName,Column>, where each column is actually something like an ArrayList<Object>.

A few possibilities:


  • Are you completely sure that memory is an issue? If you're just generally worried about size it'd be worth confirming that this will really be an issue in a running program. It takes an awful lot of rows and maps to fill up a JVM.

You could test your data set with different types of maps in the collections. Depending on your data, you can also initialize maps with a preset initial size and load factor, which may help. I've experimented with this in the past; you might get a 30% reduction in memory if you're lucky.
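As a rough illustration of the preset size/load factor idea, here is a minimal sketch using the standard `java.util.HashMap(int initialCapacity, float loadFactor)` constructor. The class name `PresizedRowMap`, the helper `capacityFor`, and the `String` keys are assumptions made for the example, not part of the original program. Note that presizing only trims the backing table; each entry is still a separate node object, so any saving depends heavily on the data.

```java
import java.util.HashMap;
import java.util.Map;

public class PresizedRowMap {
    // Smallest initial capacity that lets a HashMap hold expectedEntries
    // entries without resizing at the given load factor.
    static int capacityFor(int expectedEntries, float loadFactor) {
        return (int) Math.ceil(expectedEntries / (double) loadFactor);
    }

    // Hypothetical factory for one spreadsheet row keyed by column name.
    static Map<String, Object> newRow(int expectedColumns) {
        float loadFactor = 0.75f; // same value as HashMap's default
        return new HashMap<>(capacityFor(expectedColumns, loadFactor), loadFactor);
    }

    public static void main(String[] args) {
        Map<String, Object> row = newRow(12);
        row.put("name", "Alice");
        row.put("city", "Oslo");
        System.out.println(row.size());
    }
}
```

A higher load factor makes the table smaller for the same number of entries at the cost of more collisions, so it is worth measuring both memory and lookup time on real data before settling on values.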

What about storing your data in a single matrix-like data structure (an existing library implementation or something like a wrapper around a List of Lists), with a single map that maps column keys to matrix columns?
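The matrix-like layout could be sketched roughly as follows: one `HashMap` for the whole table maps column keys to positions, and each column is a plain `ArrayList<Object>`. The class `ColumnTable`, its method names, and the use of `String` column keys (instead of the Field objects from the question) are all assumptions for illustration, not an existing library API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class ColumnTable {
    // The only map in the table: column key -> position in `columns`.
    private final Map<String, Integer> columnIndex = new HashMap<>();
    // Row i of the table is element i of every column list.
    private final List<List<Object>> columns = new ArrayList<>();
    private int rowCount = 0;

    public void addColumn(String key) {
        if (columnIndex.containsKey(key)) {
            throw new IllegalArgumentException("Duplicate column: " + key);
        }
        List<Object> column = new ArrayList<>(rowCount);
        for (int i = 0; i < rowCount; i++) {
            column.add(null); // pad so the new column covers existing rows
        }
        columnIndex.put(key, columns.size());
        columns.add(column);
    }

    // Appends an empty row and returns its index.
    public int addRow() {
        for (List<Object> column : columns) {
            column.add(null);
        }
        return rowCount++;
    }

    public Object get(int row, String key) {
        return column(key).get(row);
    }

    public void set(int row, String key, Object value) {
        column(key).set(row, value);
    }

    // Removes every row whose cell in column `key` equals `value`;
    // returns the number of rows removed.
    public int removeRowsWhere(String key, Object value) {
        List<Object> match = column(key);
        int removed = 0;
        // Walk backwards so removals do not shift rows not yet visited.
        for (int row = rowCount - 1; row >= 0; row--) {
            if (Objects.equals(match.get(row), value)) {
                for (List<Object> column : columns) {
                    column.remove(row); // int argument: removes by index
                }
                rowCount--;
                removed++;
            }
        }
        return removed;
    }

    public int rowCount() {
        return rowCount;
    }

    private List<Object> column(String key) {
        Integer index = columnIndex.get(key);
        if (index == null) {
            throw new IllegalArgumentException("Unknown column: " + key);
        }
        return columns.get(index);
    }

    public static void main(String[] args) {
        ColumnTable table = new ColumnTable();
        table.addColumn("name");
        table.addColumn("city");
        int r = table.addRow();
        table.set(r, "name", "Alice");
        table.set(r, "city", "Oslo");
        r = table.addRow();
        table.set(r, "name", "Bob");
        table.set(r, "city", "Paris");
        table.removeRowsWhere("city", "Oslo");
        System.out.println(table.rowCount() + " " + table.get(0, "name"));
    }
}
```

Compared with one `HashMap` per row, this keeps the per-cell cost close to a single object reference. The trade-off is that removing a row shifts the tail of every column, so bulk deletions on very large tables may call for a different strategy.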
