What's a good hash function for English words?


Question

I have a long list of English words that I would like to hash. What would be a good hash function? So far, my hash function sums the ASCII values of the letters and takes the result modulo the table size. I'm looking for something simple and efficient.

Recommended answer

Simply summing the letters is not a good strategy, because any permutation of the same letters gives the same result. Anagrams such as "listen" and "silent" would always land in the same slot.
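To make the problem concrete, here is a minimal sketch of the additive hash described in the question. The function name `additive_hash` and its `table_size` parameter are assumptions for illustration; the question does not show its actual code.

```c
/* Additive hash as described in the question (hypothetical name):
 * sum the byte values of the string, then reduce modulo the table size.
 * Addition is commutative, so letter order has no effect on the result. */
unsigned long additive_hash(const char *str, unsigned long table_size)
{
    unsigned long sum = 0;

    while (*str)
        sum += (unsigned char)*str++;

    return sum % table_size;
}
```

With this function, `additive_hash("listen", n)` and `additive_hash("silent", n)` are equal for every table size `n`, since both words contain the same letters.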

This one (djb2) is quite popular and works nicely with ASCII strings.

unsigned long hashstring(const unsigned char *str)
{
    unsigned long hash = 5381; /* djb2 starting value */
    int c;

    /* Consume bytes until the terminating NUL. */
    while ((c = *str++) != 0)
        hash = ((hash << 5) + hash) + c; /* hash * 33 + c */

    return hash;
}
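The function above returns a full-width hash value, so it still has to be reduced to a table index, as the question does with its modulo step. The sketch below repeats djb2 so that it compiles on its own and adds a helper, `bucket_index`, which is a hypothetical name and not part of the original answer.

```c
/* djb2, repeated here so the sketch is self-contained. */
unsigned long hashstring(const unsigned char *str)
{
    unsigned long hash = 5381;
    int c;

    while ((c = *str++) != 0)
        hash = ((hash << 5) + hash) + c; /* hash * 33 + c */

    return hash;
}

/* Hypothetical helper: map a word to a slot in a table of table_size
 * entries. table_size must be non-zero. */
unsigned long bucket_index(const char *word, unsigned long table_size)
{
    return hashstring((const unsigned char *)word) % table_size;
}
```

Because each step multiplies the running value by 33 before adding the next character, the position of a letter matters: `hashstring` gives different values for "ab" and "ba", which the additive approach cannot do.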

If you need more alternatives and some performance measures, read here.

Added: These are general-purpose hash functions for the most common scenario, where the input domain is not known in advance, apart from some very general assumptions (e.g. the function above works slightly better with ASCII input). If you have a known, restricted domain (a fixed set of inputs), you can do better; see Fionn's answer.
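As one illustration of what "doing better" can mean for a fixed word set, the sketch below brute-forces a multiplier that maps every word to a distinct slot. This is only an assumed example of the idea; it is not necessarily the approach taken in Fionn's answer, and the names `seeded_hash` and `find_collision_free_multiplier` are hypothetical.

```c
#include <stddef.h>

/* Multiplicative string hash with a tunable multiplier (hypothetical). */
static unsigned long seeded_hash(const char *str, unsigned long mult)
{
    unsigned long hash = 0;

    while (*str)
        hash = hash * mult + (unsigned char)*str++;

    return hash;
}

/* Try multipliers 1..max_mult and return the first one that sends all n
 * words to distinct slots of a table with table_size entries.
 * Returns 0 if none is found. This sketch supports at most 64 slots. */
unsigned long find_collision_free_multiplier(const char *const *words,
                                             size_t n,
                                             unsigned long table_size,
                                             unsigned long max_mult)
{
    unsigned long mult;

    if (table_size == 0 || table_size > 64)
        return 0;

    for (mult = 1; mult <= max_mult; mult++) {
        unsigned char used[64] = {0}; /* slots taken for this multiplier */
        size_t i;

        for (i = 0; i < n; i++) {
            unsigned long slot = seeded_hash(words[i], mult) % table_size;
            if (used[slot])
                break; /* collision: try the next multiplier */
            used[slot] = 1;
        }
        if (i == n)
            return mult; /* every word got its own slot */
    }
    return 0;
}
```

For the three words "red", "green" and "blue" with 7 slots, multiplier 1 (a plain sum) puts "green" and "blue" in the same slot, while multiplier 2 separates all three. A search like this only pays off when the word list is fixed, because adding a word can invalidate the chosen multiplier.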
