Java: HashSet vs. HashMap


Question

I have a program that works on enormous data sets. The objects are best stored in hash-based containers, since the program constantly looks up objects in the container.

My first idea was to use HashMap, since this container's get and remove methods fit my use case best.

However, I have found that HashMap consumes a lot of memory, which is a major problem. I thought switching to HashSet would be better because it takes only <E> rather than <K,V> per element, but when I looked at the implementation I learned that it uses an underlying HashMap! That means it won't save any memory!

So these are my questions:

  • Are all my assumptions true?
  • Is HashMap wasteful with memory? More specifically, what is its overhead per entry?
  • Is HashSet just as wasteful as HashMap?
  • Are there any other hash-based containers that consume significantly less memory?

Update

As requested in the comments, I will elaborate a bit on my program. The HashMap is meant to hold pairs of other objects, together with a numeric value (a float) calculated from each pair. Along the way, the program extracts some of the pairs and enters new ones. Given a pair, it needs either to ensure that it does not already hold this pair or to remove it. The mapping can be done using the float value or the hashCode of the pair object.

Additionally, when I say "enormous data sets" I am talking about ~4*10^9 objects.

Solution

There are very useful tips on this site about collection performance in Java.

HashSet is built on top of a HashMap<T, Object>, where the value is a singleton 'present' object. This means that the memory consumption of a HashSet is identical to that of a HashMap: in order to store SIZE values, you need 32 * SIZE + 4 * CAPACITY bytes (plus the size of your values). It is definitely not a memory-friendly collection.
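To make the "singleton value" point concrete, here is a minimal sketch of a set that delegates everything to a HashMap. It is a simplified illustration of the idea, not the JDK source; the class name MapBackedSet and its demo main method are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified set that stores its elements as keys of a HashMap. */
final class MapBackedSet<E> {
    // One shared dummy object is stored as the value for every key.
    private static final Object PRESENT = new Object();

    private final Map<E, Object> map = new HashMap<>();

    /** Returns true if the element was not already in the set. */
    boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    /** Returns true if the element was in the set and has been removed. */
    boolean remove(Object o) {
        return map.remove(o) == PRESENT;
    }

    boolean contains(Object o) {
        return map.containsKey(o);
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        MapBackedSet<String> set = new MapBackedSet<>();
        set.add("pair-1");
        set.add("pair-1"); // duplicate: the map keeps a single entry
        System.out.println(set.size());
    }
}
```

Every element still costs one full map entry (key reference, value reference, cached hash, next pointer), which is why dropping the value type parameter does not save memory.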

THashSet (from the Trove library) could be the easiest replacement collection for a HashSet: it implements Set and Iterable, which means you only need to change a single letter in the initialization of your set.

THashSet uses a single object array for its values, so it uses 4 * CAPACITY bytes for storage. As you can see, compared to the JDK HashSet, you will save 32 * SIZE bytes given an identical load factor, which is a huge improvement.
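The two formulas can be plugged into a small calculator to see the difference for a given size. This is only a sketch under stated assumptions: the 32-byte entry and 4-byte slot figures are taken from the formulas above (they presumably assume 4-byte object references), the class and method names are hypothetical, and the example uses the same table capacity for both collections, which a real THashSet may not match.

```java
/** Hypothetical calculator for the overhead formulas quoted above. */
final class HashMemoryEstimate {
    // Assumed constants, copied from the formulas in the answer.
    static final long ENTRY_BYTES = 32; // per stored element in a JDK HashSet/HashMap
    static final long SLOT_BYTES = 4;   // per slot of the backing array

    /** 32 * SIZE + 4 * CAPACITY, excluding the size of the stored objects. */
    static long hashSetBytes(long size, long capacity) {
        return ENTRY_BYTES * size + SLOT_BYTES * capacity;
    }

    /** 4 * CAPACITY, excluding the size of the stored objects. */
    static long tHashSetBytes(long capacity) {
        return SLOT_BYTES * capacity;
    }

    public static void main(String[] args) {
        long size = 1_000_000L;
        long capacity = 2_097_152L; // assumption: 2^21 slots for both collections

        long jdk = hashSetBytes(size, capacity);
        long trove = tHashSetBytes(capacity);

        System.out.println("HashSet overhead:  " + jdk + " bytes");
        System.out.println("THashSet overhead: " + trove + " bytes");
        System.out.println("Saved:             " + (jdk - trove) + " bytes");
    }
}
```

With these inputs the saving is exactly 32 * SIZE, that is, 32,000,000 bytes for one million elements, independent of the capacity chosen.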

Also, the image below, which I took from here, can help with choosing the right collection.
