写一个哈希函数需要一些方向来排序约160,000个字符串 [英] I need some direction on writing a Hash Function to sort ~160,000 strings

查看:147
本文介绍了写一个哈希函数需要一些方向来排序约160,000个字符串的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我的导师把这个抛给我们,告诉我们,我们只需要google如何写一个哈希函数。我对此非常无礼。我们为课程写了一个基本的哈希表模板,但是我有一个项目,因为需要〜160,000个字符串被排序到一个至少有500个桶的表中(我想要做更多的速度)。

My instructor dumped this on us, and told us we just needed to google how to write a hash function. I am quite directionless on this. We wrote a basic Hash Table template for class, but I have a project due that requires ~160,000 strings to be sorted into a table with at least 500 buckets (I am wanting to do more for speed).

我不知道在哪里找到简明易懂的信息。

I just have no idea where to look to get concise, easily digestible information on this.

任何帮助将不胜感激。 p>

Any help would be greatly appreciated.

推荐答案

我建议一个通用散列函数。这种功能保证了少量的预期碰撞,即使数据是由对手选择的。有很多通用散列函数。

I suggest a universal hash function. This kind of function guarantees a small number of collisions in expectation, even if the data is chosen by an adversary. There are plenty of universal hash functions.

对于字符串,您可以采用以下散列函数。

In case of strings, you can adopt the following hash function.

对于字符 c 我们定义 c(c)的算术值ie(ASCII)。对于一个字符串 x = c1c1 ... cn 我们定义

For a character c we define #(c) the arithmetic value of c ie(ASCII). For a string x=c1c1...cn we define

如果 HSize 是一个整数,并且一个大素数(您定义它)​​,范围 0 让哈希函数为:

If HSize is an integer and k a big prime number (you define it), for a range 0<a,b<k*HSizelet the hash function be:

此函数提供 [0,HSize-1]

输出值由horner的规则计算,每次操作后找到 k * HSize 的模数。

The output value is calculated by horner's rule, finding the modulo of the k*HSize division after every operation.

所以,创建一个函数HashFunction ,并传递你想要的hash作为参数的字符串。
这里是代码:

So, create a function HashFunction and pass the string you want to hash as a parameter. Here is the code:

#define k 7919 
#define Hsize 1009   
#define a 321
#define b 43112

long long HashFunction(string text)
{
  int i;
  long long  res = 0;
  long long M = (Hsize * k);
  cout<<"M = "<<M<<endl;
  cout<<"Hsize = "<<Hsize<<endl;
  cout<<"k = "<<k<<endl;
  int s=text.size();
  for(i = s-1; i >= 0; i--)
  {
    res = a * (res * 256 + (int)text[i]);
    //cout<<"res before modulo = "<<res<<endl;
    res=res % M;
    //cout<<"res after modulo = "<<res<<endl;
  }
    long long res1 = (res + b) / k;
    return res1;
}

这篇关于写一个哈希函数需要一些方向来排序约160,000个字符串的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆