HashMap stored on disk is very slow to read back from disk


Problem description


I have a HashMap that stores external UIDs and maps each one to a different ID (internal to our app) that has been assigned to the given UID.

e.g.:

  • 123.345.432=00001
  • 123.354.433=00002

The map is checked by UID to make sure the same internal ID is used if something is resent to the application.

DICOMUID2StudyIdentiferMap is defined as follows:

private static Map DICOMUID2StudyIdentiferMap = Collections.synchronizedMap(new HashMap());

The load, however, will overwrite it if it succeeds; otherwise the default empty HashMap is used.

It's read back from disk by doing:

FileInputStream f = new FileInputStream( studyUIDFile );  
ObjectInputStream s = new ObjectInputStream( f );

Map loadedMap = ( Map )s.readObject();
DICOMUID2StudyIdentiferMap = Collections.synchronizedMap( loadedMap );

The HashMap is written to disk using:

FileOutputStream f = new FileOutputStream( studyUIDFile );
ObjectOutputStream s = new ObjectOutputStream( f );

s.writeObject(DICOMUID2StudyIdentiferMap);
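
For reference, the save and load steps above can be sketched as one self-contained round trip. This is a minimal sketch, not a confirmed fix for the slowness: the class name StudyUidMapStore, the method names save and load, and the temp file are hypothetical, and the String key/value types are an assumption based on the example entries. It wraps the file streams in buffered streams and closes them with try-with-resources. It also writes a plain HashMap copy, so that repeated load/save cycles do not nest one Collections.synchronizedMap wrapper inside another.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class; names are illustrative, not from the original post.
public class StudyUidMapStore {

    // Serializes a plain HashMap copy of the map through a buffered stream.
    static void save(Map<String, String> map, File file) throws IOException {
        HashMap<String, String> copy;
        // Collections.synchronizedMap requires manual locking while the map is iterated.
        synchronized (map) {
            copy = new HashMap<>(map);
        }
        try (ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(new FileOutputStream(file)))) {
            out.writeObject(copy);
        }
    }

    // Reads the map back and wraps it exactly once in a synchronized view.
    @SuppressWarnings("unchecked")
    static Map<String, String> load(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(new FileInputStream(file)))) {
            return Collections.synchronizedMap((Map<String, String>) in.readObject());
        }
    }

    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("studyUid", ".ser"); // assumed location
        file.deleteOnExit();

        Map<String, String> map = Collections.synchronizedMap(new HashMap<>());
        map.put("123.345.432", "00001");
        map.put("123.354.433", "00002");

        save(map, file);
        Map<String, String> loaded = load(file);
        System.out.println(loaded.get("123.345.432")); // prints 00001
    }
}
```

Timing the load call on both machines (for example with System.nanoTime()) would show whether the time is really spent in deserialization or elsewhere.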

The issue I have is that performance is fine when running locally in Eclipse, but when the application is running in normal use on a machine, the HashMap takes several minutes to load from disk. Once loaded, it also takes a long time to check for a previous value, e.g. by seeing whether DICOMUID2StudyIdentiferMap.put(..., ...) returns a value.
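
As an illustration of the "check for a previous value" step, here is a minimal sketch that uses Map.putIfAbsent instead of inspecting the return value of put. It assumes Java 8 or later and String keys and values; the class name StudyIdLookup, the field uidToStudyId, and the method internalIdFor are hypothetical. Unlike put, putIfAbsent does not overwrite an existing mapping, so a resent UID keeps its original internal ID.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical lookup helper; names are illustrative, not from the original post.
public class StudyIdLookup {

    private static final Map<String, String> uidToStudyId =
            Collections.synchronizedMap(new HashMap<>());

    // Returns the internal id already stored for uid, or stores candidateId and returns it.
    static String internalIdFor(String uid, String candidateId) {
        // putIfAbsent returns the previous value, or null if there was no mapping.
        String existing = uidToStudyId.putIfAbsent(uid, candidateId);
        return existing != null ? existing : candidateId;
    }

    public static void main(String[] args) {
        System.out.println(internalIdFor("123.345.432", "00001")); // prints 00001
        System.out.println(internalIdFor("123.345.432", "00009")); // prints 00001
    }
}
```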

I load the same map object in both cases; it's a ~400 KB file. The HashMap it contains has about 3000 key-value pairs.

Why is it so slow on one machine, but not in Eclipse?

The machine is a VM running XP. It has only recently started becoming slow to read the HashMap, so I assume it is related to the map's size, although I don't think 400 KB is very big.

Any advice welcome, TIA

Solution

I'm not sure that serialising your Map is the best option. If the Map is kept on disk for persistence, why not use a library that's designed for disk? Check out Kyoto Cabinet. It's actually written in C++, but there is a Java API. I've used it several times; it's very easy to use, very fast, and can scale to a huge size.

This is an example I'm copy/pasting for Tokyo Cabinet, the older version of Kyoto Cabinet, but it's basically the same:

import tokyocabinet.HDB;

....

String dir = "/path/to/my/dir/";
HDB hash = new HDB();

// open the hash for read/write, create if does not exist on disk
if (!hash.open(dir + "unigrams.tch", HDB.OWRITER | HDB.OCREAT)) {
    throw new IOException("Unable to open " + dir + "unigrams.tch: " + hash.errmsg());
}

// Add something to the hash
hash.put("blah", "my string");

// Close it
hash.close();
