哈希查找和二分查找哪个更快? [英] Which is faster, Hash lookup or Binary search?

查看:213
本文介绍了哈希查找和二分查找哪个更快?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

当给定一组静态对象(静态,即一旦加载它很少更改,就需要重复并发查找以实现最佳性能,这是更好的,HashMap 或使用一些自定义比较器进行二分搜索的数组?

答案是对象类型还是结构类型的函数?Hash 和/或 Equal 函数性能?哈希唯一性?列表大小?Hashset 大小/集合大小?

我正在查看的集合大小可以从 500k 到 10m 不等 - 以防这些信息有用.

虽然我正在寻找 C# 答案,但我认为真正的数学答案不在于语言,所以我不包括该标签.但是,如果有需要注意的 C# 特定事项,则需要该信息.

解决方案

好的,我会尽量简短.

C# 简答:

测试两种不同的方法.

.NET 为您提供了通过一行代码改变方法的工具.否则,请使用 System.Collections.Generic.Dictionary 并确保将其初始化为大数字作为初始容量,否则由于 GC 必须执行的工作来收集旧的存储桶数组,因此您将在余生中插入项目.

更长的答案:

哈希表具有几乎恒定的查找时间,并且在现实世界中访问哈希表中的项目不仅需要计算哈希值.

要访问一个项目,您的哈希表将执行以下操作:

  • 获取密钥的哈希值
  • 获取该哈希的桶号(通常映射函数看起来像这个桶 = hash % bucketsCount)
  • 遍历项目链(基本上它是共享项目的列表同一个桶,大多数哈希表使用这种处理桶/哈希的方法碰撞)从那开始存储桶并将每个键与您正在尝试的项目之一添加/删除/更新/检查是否

查找时间取决于您的哈希函数有多好"(输出有多稀疏)和速度有多快、您使用的存储桶数量以及键比较器的速度有多快,这并不总是最好的解决方案.

更好更深入的解释:http://en.wikipedia.org/wiki/Hash_table

When given a static set of objects (static in the sense that once loaded it seldom if ever changes) into which repeated concurrent lookups are needed with optimal performance, which is better, a HashMap or an array with a binary search using some custom comparator?

Is the answer a function of object or struct type? Hash and/or Equal function performance? Hash uniqueness? List size? Hashset size/set size?

The size of the set that I'm looking at can be anywhere from 500k to 10m - incase that information is useful.

While I'm looking for a C# answer, I think the true mathematical answer lies not in the language, so I'm not including that tag. However, if there are C# specific things to be aware of, that information is desired.

解决方案

Ok, I'll try to be short.

C# short answer:

Test the two different approaches.

.NET gives you the tools to change your approach with a line of code. Otherwise use System.Collections.Generic.Dictionary and be sure to initialize it with a large number as initial capacity or you'll pass the rest of your life inserting items due to the job GC has to do to collect old bucket arrays.

Longer answer:

An hashtable has ALMOST constant lookup times and getting to an item in an hash table in the real world does not just require to compute an hash.

To get to an item, your hashtable will do something like this:

  • Get the hash of the key
  • Get the bucket number for that hash (usually the map function looks like this bucket = hash % bucketsCount)
  • Traverse the items chain (basically it's a list of items that share the same bucket, most hashtables use this method of handling bucket/hash collisions) that starts at that bucket and compare each key with the one of the item you are trying to add/delete/update/check if contained.

Lookup times depend on how "good" (how sparse is the output) and fast is your hash function, the number of buckets you are using and how fast is the keys comparer, it's not always the best solution.

A better and deeper explanation: http://en.wikipedia.org/wiki/Hash_table

这篇关于哈希查找和二分查找哪个更快?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆