通过哈希值比较长字符串 [英] Comparing long strings by their hashes
问题描述
为了提高比较字符串的函数的性能,我决定通过比较它们的哈希值来比较它们.那么是否可以保证 2 个非常长的字符串的哈希值彼此相等,那么这些字符串也彼此相等?
Trying to improve the performance of a function that compares strings I decided to compare them by comparing their hashes. So is there a guarantee if the hash of 2 very long strings are equal to each other then the strings are also equal to each other?
推荐答案
虽然可以保证 2 个相同的字符串会给你相同的哈希值,但反过来不是这样:对于给定的哈希值,总是有几个可能的字符串产生相同的哈希.由于 PigeonHole 原则,这是正确的.
While it's guaranteed that 2 identical strings will give you equal hashes, the other way round is not true : for a given hash, there are always several possible strings which produce the same hash. This is true due to the PigeonHole principle.
话虽如此,2 个不同的字符串产生相同哈希的机会可以变得非常小,以至于被认为等同于 null.
That being said, the chances of 2 different strings producing the same hash can be made infinitesimal, to the point of being considered equivalent to null.
这种散列的一个相当经典的例子是 MD5,它具有近乎完美的 128 位分布.这意味着您在 2^128 中有一次机会 2 个不同的字符串产生相同的哈希值.嗯,基本上,几乎和不可能一样.
A fairly classical example of such hash is MD5, which has a near perfect 128 bits distribution. Which means that you have one chance in 2^128 that 2 different strings produce the same hash. Well, basically, almost the same as impossible.
这篇关于通过哈希值比较长字符串的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!