Good Hash Function for Strings
Question
I'm trying to think up a good hash function for strings. I was thinking it might be a good idea to sum the Unicode values of the first five characters in the string (assuming it has five; otherwise, stop where the string ends). Is that a good idea or a bad one?
I am doing this in Java, but I wouldn't imagine that makes much of a difference.
Recommended answer
Usually hashes don't use plain sums, because addition ignores character order: otherwise "stop" and "pots" would have the same hash.
You also wouldn't limit it to the first n characters, because otherwise "house" and "houses" would have the same hash.
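Both collisions are easy to reproduce. The following is a minimal sketch of the scheme proposed in the question; the class name SumHashDemo and the method sumOfFirstFive are made up for illustration and do not come from the question.

```java
public class SumHashDemo {
    // Hypothetical helper: sums the UTF-16 code units of at most
    // the first five characters, as the question proposes.
    static int sumOfFirstFive(String s) {
        int sum = 0;
        int limit = Math.min(5, s.length());
        for (int i = 0; i < limit; i++) {
            sum += s.charAt(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        // Anagrams collide because addition ignores character order.
        System.out.println(sumOfFirstFive("stop") == sumOfFirstFive("pots"));    // true
        // Strings sharing their first five characters collide
        // because everything after them is ignored.
        System.out.println(sumOfFirstFive("house") == sumOfFirstFive("houses")); // true
    }
}
```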
Generally, hashes take the running value and multiply it by a prime number before mixing in the next character (this makes it more likely to generate unique hashes). So you could do something like:
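// s is the string being hashed (the variable name is illustrative)
int hash = 7;
for (int i = 0; i < s.length(); i++) {
    // multiply the running hash by a prime, then add the next character
    hash = hash * 31 + s.charAt(i);
}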
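As a rough, self-contained sketch, the loop above can be wrapped in a method and checked against the two problem pairs from earlier. The class name PolyHashDemo and the method polyHash are hypothetical names chosen for this example.

```java
public class PolyHashDemo {
    // Hypothetical helper wrapping the loop above: seed 7, multiplier 31.
    static int polyHash(String s) {
        int hash = 7;
        for (int i = 0; i < s.length(); i++) {
            // int arithmetic wraps around on overflow in Java,
            // which is acceptable for a hash value.
            hash = hash * 31 + s.charAt(i);
        }
        return hash;
    }

    public static void main(String[] args) {
        // Order now matters, so the anagrams no longer collide.
        System.out.println(polyHash("stop") != polyHash("pots"));    // true
        // Every character contributes, so the extra "s" changes the hash.
        System.out.println(polyHash("house") != polyHash("houses")); // true
    }
}
```

Java's built-in String.hashCode() uses the same multiply-by-31 scheme, starting from 0 rather than 7, so in practice calling s.hashCode() is usually enough.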