Good Hash Function for Strings
Question
I'm trying to think up a good hash function for strings. I was thinking it might be a good idea to sum the Unicode values of the first five characters in the string (assuming it has five; otherwise stop where it ends). Would that be a good idea, or a bad one?
I am doing this in Java, but I wouldn't imagine that would make much of a difference.
Recommended answer
Usually a hash wouldn't just sum the characters, because otherwise "stop" and "pots" would have the same hash.
And you wouldn't limit it to the first n characters, because otherwise "house" and "houses" would have the same hash.
Generally, hashes take the running value and multiply it by a prime number before adding the next character. This makes the result depend on character order and makes it more likely to generate unique hashes. So you could do something like:
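// s is the String being hashed (the name is an assumption)
int hash = 7;
for (int i = 0; i < s.length(); i++) {
    hash = hash * 31 + s.charAt(i);  // int overflow wraps around, which is fine for a hash
}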
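
To make the comparison concrete, here is a minimal sketch that puts the summing idea from the question next to the multiply-by-a-prime loop above. The class name `StringHashSketch` and the method names `sumOfFirstFive` and `polynomialHash` are made up for illustration; the seed 7 and multiplier 31 are simply the values used in the snippet above, not the only reasonable choice.

```java
public class StringHashSketch {

    // The idea from the question: sum the first five chars (UTF-16 code units),
    // or fewer if the string is shorter.
    static int sumOfFirstFive(String s) {
        int sum = 0;
        int limit = Math.min(5, s.length());
        for (int i = 0; i < limit; i++) {
            sum += s.charAt(i);
        }
        return sum;
    }

    // The approach from the answer: start at 7, then for every character
    // multiply the running value by 31 and add the character.
    static int polynomialHash(String s) {
        int hash = 7;
        for (int i = 0; i < s.length(); i++) {
            hash = hash * 31 + s.charAt(i); // int overflow wraps around
        }
        return hash;
    }

    public static void main(String[] args) {
        // Anagrams collide under the sum, but not under the polynomial hash.
        System.out.println(sumOfFirstFive("stop") == sumOfFirstFive("pots"));
        System.out.println(polynomialHash("stop") == polynomialHash("pots"));

        // Strings sharing their first five characters collide under the truncated sum.
        System.out.println(sumOfFirstFive("house") == sumOfFirstFive("houses"));
        System.out.println(polynomialHash("house") == polynomialHash("houses"));
    }
}
```

Collisions are still possible with the polynomial version, since an `int` has only 2^32 values, but they no longer follow from simple reordering or from a shared prefix. For comparison, Java's built-in `String.hashCode()` is documented as the same kind of polynomial with multiplier 31, so in practice calling `s.hashCode()` is usually enough.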