Fastest Java HashSet<Integer> library


Question

In addition to this quite old post, I need something that uses primitives and speeds up an application that contains lots of HashSets of Integers:

Set<Integer> set = new HashSet<Integer>();

People mention libraries such as Guava, Javolution and Trove, but I cannot find a solid comparison of them in terms of benchmarks and performance results, or at least a good answer based on real experience. From what I see, many recommend Trove's TIntHashSet, but others say it is not that good. Some say Guava is supercool and manageable, but I do not need beauty and maintainability, only execution time, so Python-style Guava goes home :) Javolution? I've visited the website; it seems too old to me and thus wacky.

The library should provide the best achievable execution time; memory does not matter.

"Thinking in Java" describes the idea of creating a custom HashMap with int[] as keys. I would like to see something similar for a HashSet, or simply download and use an amazing library.

EDIT (in response to the comments below): In my project I start from about 50 HashSet<Integer> collections, then I call a function about 1000 times that internally creates up to 10 HashSet<Integer> collections. If I change the initial parameters, these numbers may grow exponentially. I only use the add(), contains() and clear() methods on those collections, which is why they were chosen.

So I'm looking for a library that implements HashSet or something similar, but does it faster by avoiding the overhead of autoboxing Integer, and maybe other overhead I'm not aware of. In fact, my data comes in as ints, and I store them in those HashSets.
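To make the request concrete, here is a minimal sketch of what a primitive-backed set supporting only add(), contains() and clear() could look like. The class name IntOpenHashSet, the multiplicative hash and the fixed 0.5 load limit are assumptions chosen for illustration; this is not the implementation used by Trove or any other library, and it has not been benchmarked.

```java
import java.util.Arrays;

/** Hypothetical minimal set of primitive ints: open addressing with linear probing. */
final class IntOpenHashSet {
    private int[] keys;      // slot contents, meaningful only where used[i] is true
    private boolean[] used;  // marks occupied slots
    private int mask;        // capacity - 1; capacity is always a power of two
    private int size;

    IntOpenHashSet(int expected) {
        int capacity = 16;
        // Size the table so that `expected` elements stay at or below half full.
        while (capacity < expected * 2L && capacity < (1 << 30)) {
            capacity <<= 1;
        }
        keys = new int[capacity];
        used = new boolean[capacity];
        mask = capacity - 1;
    }

    // Scramble the key so that clustered inputs spread over the table.
    private int slot(int key) {
        int h = key * 0x9E3779B9;
        return (h ^ (h >>> 16)) & mask;
    }

    /** Adds the key; returns false if it was already present. No boxing involved. */
    boolean add(int key) {
        int i = slot(key);
        while (used[i]) {
            if (keys[i] == key) return false;
            i = (i + 1) & mask;          // linear probing: try the next slot
        }
        used[i] = true;
        keys[i] = key;
        if (++size * 2 > keys.length) grow();   // keep the table at most half full
        return true;
    }

    boolean contains(int key) {
        int i = slot(key);
        while (used[i]) {                // an empty slot ends the probe sequence
            if (keys[i] == key) return true;
            i = (i + 1) & mask;
        }
        return false;
    }

    /** Empties the set but keeps the allocated arrays for reuse. */
    void clear() {
        Arrays.fill(used, false);
        size = 0;
    }

    int size() {
        return size;
    }

    // Doubles the table and reinserts every key (no handling of tables beyond 2^30 slots).
    private void grow() {
        int[] oldKeys = keys;
        boolean[] oldUsed = used;
        keys = new int[oldKeys.length << 1];
        used = new boolean[keys.length];
        mask = keys.length - 1;
        size = 0;
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldUsed[i]) add(oldKeys[i]);
        }
    }

    public static void main(String[] args) {
        IntOpenHashSet set = new IntOpenHashSet(100);
        set.add(42);
        set.add(-7);
        System.out.println(set.contains(42) + " " + set.contains(5) + " " + set.size());
        set.clear();
        System.out.println(set.size());
    }
}
```

Since clear() keeps the arrays, one instance could be reused across the roughly 1000 function calls instead of allocating new sets each time. Whether that actually helps would have to be measured.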

Recommended answer

Have you tried working with the initial capacity and load factor parameters when creating your HashSet?

HashSet doc

Initial capacity, as you might expect, refers to how many buckets the empty hash set has when it is created, and the load factor is the threshold (entries divided by buckets) at which the hash table grows. Java's default load factor is 0.75. Normally you want to keep the ratio between used buckets and total buckets below two thirds, which is commonly regarded as a good ratio for stable performance in a hash table.

Dynamic resizing of hash tables

So basically, try to set an initial capacity that fits your needs (to avoid re-creating the hash table and reassigning its values when it grows), and fiddle with the load factor until you find a sweet spot.
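As a rough sketch of that advice, the snippet below sizes a HashSet up front using the two-argument constructor HashSet(int initialCapacity, float loadFactor). The helper names capacityFor and newPresizedSet and the figures in main are illustrative assumptions, not values taken from your project.

```java
import java.util.HashSet;
import java.util.Set;

final class PresizedSets {
    // Smallest initial capacity that holds `expected` elements without a resize
    // at the given load factor (HashSet rounds it up to a power of two internally).
    static int capacityFor(int expected, float loadFactor) {
        return (int) Math.ceil(expected / (double) loadFactor);
    }

    static Set<Integer> newPresizedSet(int expected, float loadFactor) {
        return new HashSet<>(capacityFor(expected, loadFactor), loadFactor);
    }

    public static void main(String[] args) {
        int expected = 100_000;    // assumed number of elements per set
        float loadFactor = 0.5f;   // lower than the default 0.75; tune by measuring

        Set<Integer> set = newPresizedSet(expected, loadFactor);
        for (int i = 0; i < expected; i++) {
            set.add(i);            // no rehash should occur while filling
        }
        System.out.println(set.size());

        set.clear();               // clear() keeps the table, so the set can be reused
        System.out.println(set.isEmpty());
    }
}
```

This still boxes every int, so it only removes the resizing cost, not the autoboxing overhead.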

For your particular data distribution and pattern of adding and looking up values, a lower load factor might help (a higher one is unlikely to, but your mileage may vary).
