Hashmap slower after deserialization - Why?


Problem description



I have a pretty large Hashmap (~250MB). Creating it takes about 50-55 seconds, so I decided to serialize it and save it to a file. Reading from the file takes about 16-17 seconds now.

The only problem is that lookups seem to be slower this way. I always thought that the hashmap is read from the file into memory, so performance should be the same as when I create the hashmap myself, right? Here is the code I am using to read the hashmap from the file:

File file = new File("omaha.ser");
FileInputStream f = new FileInputStream(file);
ObjectInputStream s = new ObjectInputStream(new BufferedInputStream(f));
omahaMap = (HashMap<Long, Integer>) s.readObject();
s.close();

300 million lookups take about 3.1 seconds when I create the hashmap myself, and about 8.5 seconds when I read the same hashmap from file. Does anybody have an idea why? Am I overlooking something obvious?

EDIT:

I "measured" the time simply by taking it with System.nanoTime(), so no proper benchmarking method was used. Here is the code:

public class HandEvaluationTest
{
    public static void Test()
    {

        HandEvaluation.populate5Card();
        HandEvaluation.populate9CardOmaha();


        Card[] player1cards = {new Card("4s"), new Card("2s"), new Card("8h"), new Card("4d")};
        Card[] player2cards = {new Card("As"), new Card("9s"), new Card("6c"), new Card("2h")};
        Card[] player3cards = {new Card("9h"), new Card("7h"), new Card("Kc"), new Card("Kh")};
        Card[] table = {new Card("2d"), new Card("2c"), new Card("3c"), new Card("5c"), new Card("4h")};


        int j = 0, k = 0, l = 0;
        long startTime = System.nanoTime();
        for(int p = 0; p < 100000000; p++)
        {
            j = HandEvaluation.handEval9Hash(player1cards, table);
            k = HandEvaluation.handEval9Hash(player2cards, table);
            l = HandEvaluation.handEval9Hash(player3cards, table);
        }
        long estimatedTime = System.nanoTime() - startTime;
        System.out.println("Time needed: " + estimatedTime*Math.pow(10,-6) + "ms");
        System.out.println("Handstrength Player 1: " + j);
        System.out.println("Handstrength Player 2: " + k);
        System.out.println("Handstrength Player 3: " + l);
    }
}

The big hashmap work is done in HandEvaluation.populate9CardOmaha(). The 5-card one is small. The code for the big one:

public static void populate9CardOmaha()
{
    // Check if the hashmap is already there - then just read it and exit
    File hashmap = new File("omaha.ser");
    if(hashmap.exists())
    {
        try
        {
            File file = new File("omaha.ser");
            FileInputStream f = new FileInputStream(file);
            ObjectInputStream s = new ObjectInputStream(new BufferedInputStream(f));
            omahaMap = (HashMap<Long, Integer>) s.readObject();
            s.close();
        }
        catch(IOException ioex) {ioex.printStackTrace();}
        catch(ClassNotFoundException cnfex)
        {
            System.out.println("Class not found");
            cnfex.printStackTrace();
            return;
        }
        return;
    }

    // if it's not there, populate it yourself
    ... Code for populating hashmap ...
    // and then save it to file

    try
    {
        File file = new File("omaha.ser");
        FileOutputStream f = new FileOutputStream(file);
        ObjectOutputStream s = new ObjectOutputStream(new BufferedOutputStream(f));
        s.writeObject(omahaMap);
        s.close();
    }
    catch(IOException ioex) {ioex.printStackTrace();}
}

When I populate it myself (i.e. the file is not there), lookups in HandEvaluationTest.Test() take about 8 seconds instead of 3. Maybe it's just my very naive way of measuring the elapsed time?
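As an aside, a less naive measurement would warm the code up first and then take several timed samples, so that JIT compilation and stray GC pauses show up as outliers rather than skewing a single number. A sketch, reusing the Card/HandEvaluation setup inside Test() above (the iteration counts are arbitrary):

// Warm-up pass: let the JIT compile the hot path before measuring.
for(int warmup = 0; warmup < 1000000; warmup++)
{
    HandEvaluation.handEval9Hash(player1cards, table);
}
// Take several samples of the same workload and print each one.
for(int run = 1; run <= 5; run++)
{
    long start = System.nanoTime();
    for(int p = 0; p < 10000000; p++)
    {
        j = HandEvaluation.handEval9Hash(player1cards, table);
    }
    System.out.println("Run " + run + ": " + (System.nanoTime() - start) / 1000000 + " ms");
}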

Solution

This question was interesting, so I wrote my own test case to verify it. I found no difference in speed between live lookups and lookups against a map loaded from a serialized file. The program is available at the end of the post for anyone interested in running it.

  • The methods were monitored using JProfiler.
  • The serialized file is comparable to yours: ~230 MB.
  • In-memory lookups (without any serialization involved) cost 1210 ms.
  • After serializing the map and reading it back in, the cost of lookups remained the same (well, almost: 1224 ms).
  • The profiler was tweaked to add minimal overhead in both scenarios.
  • This was measured on the Java(TM) SE Runtime Environment (build 1.6.0_25-b06) / 4 CPUs at 1.7 GHz / 4 GB RAM at 800 MHz.

Measuring is tricky. I myself noticed the 8-second lookup time that you described, but guess what else I noticed when that happened.

GC activity

Your measurements are probably picking that up too. If you isolate the measurement to Map.get() alone, you'll see that the results are comparable.
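One way to check this (an illustrative standalone sketch, not part of the original answer) is to sample the JVM's accumulated GC time around the measured loop via the standard java.lang.management API; running with -verbose:gc prints the same information to the console. The class name and iteration counts below are arbitrary:

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

public class GcAwareTimer
{
    // Accumulated GC time in milliseconds, summed over all collectors.
    // getCollectionTime() may return -1 if the JVM does not support it.
    private static long totalGcTimeMillis()
    {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans())
        {
            total += Math.max(0, gc.getCollectionTime());
        }
        return total;
    }

    public static void main(String[] args)
    {
        // Build a map comparable in shape to the one in the question.
        Map<Long, Integer> map = new HashMap<Long, Integer>();
        Random random = new Random();
        for (int i = 0; i < 1000000; i++)
        {
            map.put(random.nextLong(), random.nextInt());
        }

        // Time the lookups and report how much GC ran inside the window.
        long gcBefore = totalGcTimeMillis();
        long start = System.nanoTime();
        for (int i = 0; i < 10000000; i++)
        {
            map.get(random.nextLong());
        }
        long elapsedMillis = (System.nanoTime() - start) / 1000000;
        System.out.println("Lookups: " + elapsedMillis + " ms, GC during run: "
                + (totalGcTimeMillis() - gcBefore) + " ms");
    }
}

If GC time rises in the deserialized run but not in the live run, the "slower" lookups are really collection pauses being counted into the measurement window.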


public class GenericTest
{
    public static void main(String... args)
    {
        // Call the methods as you please for a live Vs ser <-> de_ser run
    }

    private static Map<Long, Integer> generateHashMap()
    {
        Map<Long, Integer> map = new HashMap<Long, Integer>();
        final Random random = new Random();
        for(int counter = 0 ; counter < 10000000 ; counter++)
        {
            final int value = random.nextInt();
            final long key = random.nextLong();
            map.put(key, value);
        }
        return map;
    }

    private static void lookupItems(int n, Map<Long, Integer> map)
    {
        final Random random = new Random();
        for(int counter = 0 ; counter < n ; counter++)
        {
            final long key = random.nextLong();
            final Integer value = map.get(key);
        }
    }

    private static void serialize(Map<Long, Integer> map)
    {
        try
        {
            File file = new File("temp/omaha.ser");
            FileOutputStream f = new FileOutputStream(file);
            ObjectOutputStream s = new ObjectOutputStream(new BufferedOutputStream(f));
            s.writeObject(map);
            s.close();
        }
        catch (Exception e)
        {
            e.printStackTrace();
        }
    }

    private static Map<Long, Integer> deserialize()
    {
        try
        {
            File file = new File("temp/omaha.ser");
            FileInputStream f = new FileInputStream(file);
            ObjectInputStream s = new ObjectInputStream(new BufferedInputStream(f));
            HashMap<Long, Integer> map = (HashMap<Long, Integer>) s.readObject();
            s.close();
            return map;
        }
        catch (Exception e)
        {
            throw new RuntimeException(e);
        }
    }
}
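The main method is left empty above. One possible driver, as a sketch: the warm-up passes and iteration counts are arbitrary choices, and serialize() assumes the temp/ directory already exists.

    public static void main(String... args)
    {
        // Time lookups against the freshly built map.
        Map<Long, Integer> live = generateHashMap();
        lookupItems(1000000, live); // warm-up
        long t0 = System.nanoTime();
        lookupItems(10000000, live);
        System.out.println("Live map:         " + (System.nanoTime() - t0) / 1000000 + " ms");

        // Round-trip the map through serialization, then time the same workload again.
        serialize(live);
        Map<Long, Integer> loaded = deserialize();
        lookupItems(1000000, loaded); // warm-up
        long t1 = System.nanoTime();
        lookupItems(10000000, loaded);
        System.out.println("Deserialized map: " + (System.nanoTime() - t1) / 1000000 + " ms");
    }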
