Probability of getting a duplicate value when calling GetHashCode() on strings

Question

I want to know the probability of getting duplicate values when calling the GetHashCode() method on string instances. For instance, according to:

http://ashishkhandelwal.arkutil.com/index.php/csharp/gethashcode-method-hashcode-and-hashtable-behavior-deep-level/

"blair" and "brainlessness" have the same hash code: 175803953 (on an x86 machine).

Thanks.

Recommended answer

Large.

(Sorry Jon!)

The probability of getting a hash collision among short strings is extremely large. Given a set of only ten thousand distinct short strings drawn from common words, the probability of there being at least one collision in the set is approximately 1%. If you have eighty thousand strings, the probability of there being at least one collision is over 50%.
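Those figures can be roughly reproduced with the standard birthday-problem approximation. The sketch below is in Python purely for the arithmetic. It assumes that GetHashCode() returns a 32-bit value and that hash codes are spread uniformly over all 2^32 possibilities, which is an idealization and not a documented guarantee of the real string hash. The name `collision_probability` is made up for this illustration.

```python
import math

# Assumption: a 32-bit hash code, uniformly distributed over 2**32 values.
HASH_SPACE = 2 ** 32


def collision_probability(n: int, space: int = HASH_SPACE) -> float:
    """Approximate chance that at least two of n distinct strings share a hash code."""
    # Birthday-problem approximation: p ~= 1 - exp(-n(n-1) / (2 * space)).
    # expm1 keeps precision when the exponent is close to zero.
    return -math.expm1(-n * (n - 1) / (2 * space))


for n in (10_000, 80_000):
    print(f"{n:>6} strings: {collision_probability(n):.1%}")
```

Under that uniformity assumption the estimate comes out a little above 1% for ten thousand strings and a little above 50% for eighty thousand, which is in line with the numbers quoted above.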

For a graph showing the relationship between set size and probability of collision, see my article on the subject:

http://blogs.msdn.com/b/ericlippert/archive/2010/03/22/socks-birthdays-and-hash-collisions.aspx
