这是更快,哈希查找或二进制搜索? [英] Which is faster, Hash lookup or Binary search?

查看:135
本文介绍了这是更快,哈希查找或二进制搜索?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

在给定的一组静态的对象(静态的,即一旦加载它很少,如果都没有改变),其中多次是需要具有最佳性能,这是更好的并发查询,一个的HashMap 或使用一些自定义的比较与二进制搜索的数组?

When given a static set of objects (static in the sense that once loaded it seldom if ever changes) into which repeated concurrent lookups are needed with optimal performance, which is better, a HashMap or an array with a binary search using some custom comparator?

时的答案对象或结构类型的功能?哈希和/或同等功能的表现?哈希的独特性?列表的大小? 的Hashset 尺寸/集大小?

Is the answer a function of object or struct type? Hash and/or Equal function performance? Hash uniqueness? List size? Hashset size/set size?

集合的大小,我看可以随时随地从500K至10μm - 柜面的信息是有益的。

The size of the set that I'm looking at can be anywhere from 500k to 10m - incase that information is useful.

虽然我在寻找一个C#的答案,我认为真正的数学答案不在于语言,所以我不包括标记。然而,如果存在的C#具体的东西要注意的,该信息是需要的。

While I'm looking for a C# answer, I think the true mathematical answer lies not in the language, so I'm not including that tag. However, if there are C# specific things to be aware of, that information is desired.

推荐答案

好吧,我会尽量简短。

C#简短的回答:

测试两种不同的方法。

.NET为您提供了工具来改变你的方法带线code。 否则,使用System.Collections.Generic.Dictionary,一定要与大量的初始容量初始化它,否则你会你的生活中插入项目的其余部分通过,由于GC有做收集旧桶阵列的工作。

.NET gives you the tools to change your approach with a line of code. Otherwise use System.Collections.Generic.Dictionary and be sure to initialize it with a large number as initial capacity or you'll pass the rest of your life inserting items due to the job GC has to do to collect old bucket arrays.

再回应:

这是哈希表几乎不断查找时间和获取到项目中的哈希表中的现实世界并不仅仅需要计算一个哈希值。

An hashtable has ALMOST constant lookup times and getting to an item in an hash table in the real world does not just require to compute an hash.

要获得某个项目,你哈希表会做这样的事情:

To get to an item, your hashtable will do something like this:

  • 在拿到钥匙
  • 的哈希
  • 获取桶数散列(通常是地图功能看起来像这样水桶=哈希%bucketsCount)
  • 遍历项链(基本上是共享的项目列表 同一个桶中,最哈希表使用 处理桶/散列这种方法 碰撞),在该开始 桶并比较每个键与 您要的项目之一 添加/删除/更新/检查 载。
  • Get the hash of the key
  • Get the bucket number for that hash (usually the map function looks like this bucket = hash % bucketsCount)
  • Traverse the items chain (basically it's a list of items that share the same bucket, most hashtables use this method of handling bucket/hash collisions) that starts at that bucket and compare each key with the one of the item you are trying to add/delete/update/check if contained.

查找时间取决于如何好(如何稀疏是输出)和快速是你的哈希函数,水桶使用的是和速度是关键比较器,它并不总是最好的解决方案的数量。

Lookup times depend on how "good" (how sparse is the output) and fast is your hash function, the number of buckets you are using and how fast is the keys comparer, it's not always the best solution.

有一个更好的,更深入的解释: http://en.wikipedia.org/wiki/Hash_table

A better and deeper explanation: http://en.wikipedia.org/wiki/Hash_table

这篇关于这是更快,哈希查找或二进制搜索?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆