Is it possible to make a minimal perfect hash function in this situation?

Question

I want to create a Hash Map (or another structure, if you have any suggestions) to store key value pairs. The keys will all be inserted at once at the same time as the map is created, but I don't know what the keys will be (arbitrary length strings) until runtime, when I need to create the map.

I am parsing a query string like this "x=100&name=bob&color=red&y=150" (but the string can have an unlimited number of variables and the variables can have any length name).

I want to parse it once and create a Hash Map, preferably minimal and with a perfect hash function, to satisfy linear storage requirements. Once the map is created, the values won't be modified or deleted and no more key-value pairs will be added, so the entire map is effectively a constant. I'm assuming that a variable doesn't occur twice in the string (i.e. "x=1&x=2" is not valid).

I am coding in C, and currently have a function that I can use like get("x"), which returns the string "100", but it parses the query string on every call, which takes O(n) time. I'd like to parse it once when it is first loaded, since it is a very large query string and every value will be read several times. Even though I'm using C, I don't need code in C as an answer. Pseudocode, or any suggestions at all, would be awesome!

Recommended answer

Try GPL'd gperf, or Bob Jenkins' public domain implementation in C (http://burtleburtle.net/bob/hash/perfect.html).

Steps:

  • Receive the query string and identify the domain of the perfect hash function by enumerating the list of keys.

  • Provide these keys and the list size (the range will be 1..size) to a perfect hash generation function derived from the reference implementations above.

  • Use the generated perfect hash function to create the HashMap.

  • Use the same perfect hash function to process the get requests against the HashMap.
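The steps above can be sketched in C as follows. This is only an illustration under stated assumptions: it does not use gperf or Jenkins' code. Instead it swaps in a much simpler technique, a brute-force search for a hash seed that happens to be collision-free on the given keys. All names (`Pair`, `PerfectMap`, `seeded_hash`, `parse_query`, `perfect_map_build`, `perfect_map_get`) are hypothetical, slots are numbered 0..size-1 rather than 1..size, and keys are assumed to be unique.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types; nothing here comes from gperf or Jenkins' code. */
typedef struct {
    const char *key;    /* points into the caller's (modified) query buffer */
    const char *value;
} Pair;

typedef struct {
    uint32_t seed;      /* seed that makes the hash collision-free on the keys */
    size_t   size;      /* number of keys == number of slots (minimal) */
    Pair    *slots;
} PerfectMap;

/* Seeded FNV-1a followed by a MurmurHash3-style finalizer, so that the
   low bits used by the modulo depend on the whole seed. */
static uint32_t seeded_hash(const char *s, uint32_t seed)
{
    uint32_t h = 2166136261u ^ seed;
    for (; *s; ++s) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    h ^= h >> 16; h *= 0x85ebca6bu;
    h ^= h >> 13; h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/* Step 1: split "k=v&k=v" in place and enumerate the pairs.
   Returns the number of pairs written to out. */
static size_t parse_query(char *query, Pair *out, size_t max_pairs)
{
    size_t n = 0;
    char *p = query;
    while (p && *p && n < max_pairs) {
        char *amp = strchr(p, '&');
        if (amp) *amp = '\0';
        char *eq = strchr(p, '=');
        if (eq) {                       /* segments without '=' are skipped */
            *eq = '\0';
            out[n].key = p;
            out[n].value = eq + 1;
            ++n;
        }
        p = amp ? amp + 1 : NULL;
    }
    return n;
}

/* Steps 2 and 3: try seeds until every key lands in its own slot.
   Returns 0 on success, -1 on failure (for example, duplicate keys). */
static int perfect_map_build(PerfectMap *m, const Pair *pairs, size_t n)
{
    assert(n > 0);
    m->size = n;
    m->slots = malloc(n * sizeof *m->slots);
    if (!m->slots) return -1;
    for (uint32_t seed = 0; seed < 1000000u; ++seed) {
        size_t i;
        for (i = 0; i < n; ++i) m->slots[i].key = NULL;
        for (i = 0; i < n; ++i) {
            Pair *slot = &m->slots[seeded_hash(pairs[i].key, seed) % n];
            if (slot->key) break;       /* collision: try the next seed */
            *slot = pairs[i];
        }
        if (i == n) { m->seed = seed; return 0; }
    }
    free(m->slots);
    m->slots = NULL;
    return -1;
}

/* Step 4: one hash, one string comparison per lookup. */
static const char *perfect_map_get(const PerfectMap *m, const char *key)
{
    const Pair *slot = &m->slots[seeded_hash(key, m->seed) % m->size];
    /* The hash is only perfect for the original keys, so verify the key. */
    return (slot->key && strcmp(slot->key, key) == 0) ? slot->value : NULL;
}
```

A caller would copy the query string into a writable buffer, call `parse_query`, then `perfect_map_build`, and afterwards use `perfect_map_get(&m, "x")`. The chance that a random seed is collision-free is roughly n!/n^n, so this seed search is only practical for a handful of keys; for a very large query string a more elaborate construction is needed.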

Edit: Necrolis noted in the comment below that the reference implementations output perfect hash functions as C source code, so you'll need to modify them to generate something like bytecode for a VM instead. You could also use an interpreted language such as embedded Scheme or Lua.

It would be interesting to know whether this is worth the effort compared with a simple (non-perfect) HashMap once the overhead of creating the perfect hash function is amortized over the lookups.
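For comparison, a simple non-perfect map could look like the sketch below: open addressing with linear probing, sized once because the number of keys is known after parsing. The names (`Entry`, `PlainMap`, `plain_map_init`, `plain_map_put`, `plain_map_get`) and the choice of FNV-1a with a load factor of at most 0.5 are assumptions made for illustration, not something the answer prescribes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical baseline: a fixed-size open-addressing table. */
typedef struct {
    const char *key;
    const char *value;
} Entry;

typedef struct {
    Entry *slots;
    size_t capacity;    /* always a power of two */
    size_t count;
} PlainMap;

/* Plain 32-bit FNV-1a string hash. */
static uint32_t fnv1a(const char *s)
{
    uint32_t h = 2166136261u;
    for (; *s; ++s) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    return h;
}

/* Allocate at least twice as many slots as keys (load factor <= 0.5). */
static int plain_map_init(PlainMap *m, size_t expected_keys)
{
    size_t cap = 8;
    while (cap < expected_keys * 2) cap *= 2;
    m->slots = malloc(cap * sizeof *m->slots);
    if (!m->slots) return -1;
    for (size_t i = 0; i < cap; ++i) m->slots[i].key = NULL;
    m->capacity = cap;
    m->count = 0;
    return 0;
}

/* Insert or overwrite; probes forward until the key or an empty slot is found. */
static void plain_map_put(PlainMap *m, const char *key, const char *value)
{
    size_t i = fnv1a(key) & (m->capacity - 1);
    while (m->slots[i].key && strcmp(m->slots[i].key, key) != 0)
        i = (i + 1) & (m->capacity - 1);
    if (!m->slots[i].key) {
        assert(m->count < m->capacity / 2);   /* table was sized for fewer keys */
        m->count++;
    }
    m->slots[i].key = key;
    m->slots[i].value = value;
}

/* Expected O(1): stops at the first empty slot in the probe sequence. */
static const char *plain_map_get(const PlainMap *m, const char *key)
{
    size_t i = fnv1a(key) & (m->capacity - 1);
    while (m->slots[i].key) {
        if (strcmp(m->slots[i].key, key) == 0) return m->slots[i].value;
        i = (i + 1) & (m->capacity - 1);
    }
    return NULL;
}
```

Building this table is a single pass over the keys with no search for a collision-free function, at the cost of roughly twice the slots and an occasional extra probe per lookup. Whether the perfect hash wins depends on how many lookups follow each build, which would have to be measured.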

Another option is cuckoo hashing, which also has O(1) lookups (in the worst case, not just on average).
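A minimal sketch of cuckoo hashing for a fixed key set is shown below, assuming unique keys. Each key has one candidate slot in each of two tables, so a lookup inspects at most two slots. The names (`Item`, `CuckooMap`, `cuckoo_build`, `cuckoo_get`), the eviction limit of 32, and the table size of n + 1 slots each are illustrative assumptions, not tuned values.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types for illustration. */
typedef struct {
    const char *key;
    const char *value;
} Item;

typedef struct {
    Item    *table[2];  /* two tables, one hash function each */
    size_t   capacity;  /* slots per table */
    uint32_t seed[2];
} CuckooMap;

/* Seeded FNV-1a with a MurmurHash3-style finalizer. */
static uint32_t seeded_hash(const char *s, uint32_t seed)
{
    uint32_t h = 2166136261u ^ seed;
    for (; *s; ++s) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    h ^= h >> 16; h *= 0x85ebca6bu;
    h ^= h >> 13; h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

static size_t slot_of(const CuckooMap *m, int t, const char *key)
{
    return seeded_hash(key, m->seed[t]) % m->capacity;
}

/* Insert with a bounded number of evictions.
   Returns 0 on success, -1 if the eviction chain is too long. */
static int cuckoo_insert(CuckooMap *m, Item item)
{
    int t = 0;
    for (int kicks = 0; kicks < 32; ++kicks) {
        Item *slot = &m->table[t][slot_of(m, t, item.key)];
        if (!slot->key) { *slot = item; return 0; }
        Item evicted = *slot;   /* kick out the occupant ... */
        *slot = item;
        item = evicted;         /* ... and re-home it in the other table */
        t = 1 - t;
    }
    return -1;
}

/* Build from a fixed list of items; on failure, retry with fresh seeds. */
static int cuckoo_build(CuckooMap *m, const Item *items, size_t n)
{
    m->capacity = n + 1;        /* total load stays below 50% */
    m->table[0] = malloc(m->capacity * sizeof(Item));
    m->table[1] = malloc(m->capacity * sizeof(Item));
    if (m->table[0] && m->table[1]) {
        for (uint32_t attempt = 0; attempt < 1000u; ++attempt) {
            m->seed[0] = 2 * attempt;
            m->seed[1] = 2 * attempt + 1;
            for (size_t j = 0; j < m->capacity; ++j)
                m->table[0][j].key = m->table[1][j].key = NULL;
            size_t i = 0;
            while (i < n && cuckoo_insert(m, items[i]) == 0) ++i;
            if (i == n) return 0;
        }
    }
    free(m->table[0]);
    free(m->table[1]);
    m->table[0] = m->table[1] = NULL;
    return -1;
}

/* Worst-case O(1): at most two slots are inspected. */
static const char *cuckoo_get(const CuckooMap *m, const char *key)
{
    for (int t = 0; t < 2; ++t) {
        const Item *slot = &m->table[t][slot_of(m, t, key)];
        if (slot->key && strcmp(slot->key, key) == 0) return slot->value;
    }
    return NULL;
}
```

Compared with a minimal perfect hash, this uses about twice as many slots as keys, but the build step is a simple insertion loop that only occasionally needs a restart with new seeds.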
