Dictionary with two hash functions in C#?

Views: 149 | Published: 2016/9/28 12:55:44 | Tags: c# hash dictionary


Question

I've got a huge (>>10m) list of entries. Each entry offers two hash functions:

Cheap: quickly computes hash, but its distribution is terrible (may put 99% of items in 1% of hash space)
Expensive: takes a lot of time to compute, but the distribution is much better

An ordinary Dictionary lets me use only one of these hash functions. I'd like a Dictionary that uses the cheap hash function first, and checks the expensive one on collisions.

It seems like a good idea to use a dictionary inside a dictionary for this. I currently basically use this monstrosity:

Dictionary<int, Dictionary<int, List<Foo>>>;

I improved this design so the expensive hash gets called only if there are actually two items of the same cheap hash.
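A minimal sketch of that lazy two-level layout might look like the following. This is not the poster's actual code: the class name comes from the post, but `Add`, `FindCandidates`, the nested `Bucket` type, and the two delegate parameters are all hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch, not the original poster's implementation.
public sealed class DoubleHashedDictionary<T>
{
    private sealed class Bucket
    {
        public T Single;                              // sole item while no collision has occurred
        public Dictionary<int, List<T>> ByExpensive;  // created lazily on the first collision
    }

    private readonly Func<T, int> _cheap;
    private readonly Func<T, int> _expensive;
    private readonly Dictionary<int, Bucket> _buckets = new Dictionary<int, Bucket>();

    public DoubleHashedDictionary(Func<T, int> cheap, Func<T, int> expensive)
    {
        _cheap = cheap;
        _expensive = expensive;
    }

    public void Add(T item)
    {
        int key = _cheap(item);
        Bucket bucket;
        if (!_buckets.TryGetValue(key, out bucket))
        {
            // First item with this cheap hash: the expensive hash is not needed yet.
            _buckets[key] = new Bucket { Single = item };
            return;
        }
        if (bucket.ByExpensive == null)
        {
            // Second item with the same cheap hash: promote the bucket and
            // only now pay for the expensive hash of the first item.
            bucket.ByExpensive = new Dictionary<int, List<T>>();
            AddToInner(bucket.ByExpensive, bucket.Single);
            bucket.Single = default(T);
        }
        AddToInner(bucket.ByExpensive, item);
    }

    // Returns items whose hashes match the probe. These are only candidates:
    // the caller still has to confirm real equality.
    public IReadOnlyList<T> FindCandidates(T probe)
    {
        Bucket bucket;
        if (!_buckets.TryGetValue(_cheap(probe), out bucket))
            return new T[0];
        if (bucket.ByExpensive == null)
            return new[] { bucket.Single };  // no collision so far, expensive hash skipped
        List<T> list;
        return bucket.ByExpensive.TryGetValue(_expensive(probe), out list)
            ? (IReadOnlyList<T>)list
            : new T[0];
    }

    private void AddToInner(Dictionary<int, List<T>> inner, T item)
    {
        int hash = _expensive(item);
        List<T> list;
        if (!inner.TryGetValue(hash, out list))
        {
            list = new List<T>();
            inner[hash] = list;
        }
        list.Add(item);
    }
}
```

In this sketch a bucket that holds a single item never triggers the expensive hash, neither on insertion nor on lookup, so the cost is only paid for cheap-hash values that are actually shared.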

It fits perfectly and does a flawless job for me, but it looks like something that should have died 65 million years ago.

To my knowledge, this functionality is not included in the basic framework. I am about to write a DoubleHashedDictionary class but I wanted to know of your opinion first.

As for my specific case:
First hash function = number of files in a file system directory (fast)
Second hash function = sum of size of files (slow)

Edits:

Changed title and added more information.
Added quite important missing detail.

Answer

First off, I think you're on the right path in implementing your own hashtable, if what you are describing is truly what you want. But as a critic, I'd like to ask a few questions:

Have you considered using something more unique for each entry?

I am assuming that each entry is information about a file system directory. Have you considered using its full path as the key, perhaps prefixed with the computer name or IP address?

On the other hand, if you're using the number of files as the hash key, are those directories never going to change? Because if the hash key/result changes, you will never be able to find the entry again.
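To illustrate the concern, here is a small sketch using the standard `Dictionary<TKey, TValue>`. The `DirEntry` type and its fields are hypothetical; the only assumption is that the hash code is derived from a file count that can change after insertion.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type whose hash depends on mutable state.
public sealed class DirEntry
{
    public string Path;
    public int FileCount;  // mutable, and used as the hash code

    public override int GetHashCode() { return FileCount; }

    public override bool Equals(object obj)
    {
        var other = obj as DirEntry;
        return other != null && other.Path == Path;
    }
}

public static class MutableKeyDemo
{
    // Inserts an entry, then changes the value its hash is based on.
    // Returns whether the dictionary can still find the same object.
    public static bool LookupAfterMutation()
    {
        var entry = new DirEntry { Path = "/data", FileCount = 3 };
        var map = new Dictionary<DirEntry, string>();
        map[entry] = "payload";

        entry.FileCount = 4;  // a file was added, so the hash code changes

        return map.ContainsKey(entry);
    }
}
```

The lookup fails even though the very same object is passed in, because the dictionary recorded the old hash code at insertion time and compares against it before ever calling `Equals`.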

While on this topic, if the directory content/size is never going to change, can you store that value somewhere to save the time to actually calculate that?
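One way to store the value, sketched here under the assumption that the directory really does not change, is to compute it on first use and keep it with the entry. `CachedEntry` and the `computeTotalSize` delegate are hypothetical names; the delegate stands in for whatever slow directory walk the real code performs.

```csharp
using System;

// Hypothetical sketch: caches an expensive per-directory value after the first computation.
public sealed class CachedEntry
{
    private readonly Lazy<long> _totalSize;

    public string Path { get; private set; }

    public CachedEntry(string path, Func<string, long> computeTotalSize)
    {
        Path = path;
        // Lazy<T> runs the delegate at most once, on first access to Value.
        _totalSize = new Lazy<long>(() => computeTotalSize(path));
    }

    // The slow computation happens on the first read only; later reads return the stored result.
    public long TotalSize
    {
        get { return _totalSize.Value; }
    }
}
```

If the directories can change after all, the cached value goes stale, which is the same problem as the changing hash key described above.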

Just my few cents.
