为什么元组([1,“a”,“b”,“c”,“z”,“f”)))== tuple(set([“a”,“ b“,”c“,”z“,”f“,1]))启用散列随机化的时间的85%? [英] Why does tuple(set([1,"a","b","c","z","f"])) == tuple(set(["a","b","c","z","f",1])) 85% of the time with hash randomization enabled?

查看:117
本文介绍了为什么元组([1,“a”,“b”,“c”,“z”,“f”)))== tuple(set([“a”,“ b“,”c“,”z“,”f“,1]))启用散列随机化的时间的85%?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

鉴于零比雷埃夫斯对另一个问题的回答,我们有这个

  x =元组(set([1,a,b,c,z,f]))
y =元组(set([a,b,c,z,f,1]))
print(x == y)

真实大约85%的时间.me / blog / archives / 2012/01/17 / use-random-hashing-if-you-care-about-security /rel =nofollow noreferrer> hash randomization enabled。为什么是85%?

解决方案

我会假设这个问题的任何读者都读过这两个:





首先要注意的是散列随机化决定于解释器启动。



每个字母的散列值对于两个集合都是相同的,所以唯一可能影响的是如果发生碰撞(其中order会受到影响)。






通过扣除第二个链接,我们知道这些集合的支持数组的长度从8开始:

  _ _ _ _ _ _ _ _ 

在第一种情况下,我们插入 1



< pre class =lang-none prettyprint-override> _ 1 _ _ _ _ _ _

,然后插入其余部分:

 α1? ? ? ? ? ? 

然后将它重新设置为32号:



< pre-class =lang-none prettyprint-override> 1不能与α发生碰撞,因为α是一个偶数哈希
↓所以1在插槽1中首先插入
? 1? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?

在第二种情况下,我们插入其余的内容:

 ? β? ? ? ? ? ? 

然后尝试插入1:

 尝试在这里插入1,但是如果β存在

↓会被重新映射。 β? ? ? ? ? ?

然后它会被重新设定:

 尝试在这里插入1,但是如果β存在并且
↓没有在其他地方重新映射
,则会重新映射
? β? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?

所以迭代次序是否不同只取决于β是否存在。





β的机会是5个字母中的任何一个将散列为1模8 散列为1的概率模32,因为任何哈希到1模32也哈希到1模8,我们想要找到32个插槽中的一个,其中的一个是在插槽1中:
$ b

  5(字母数)/ 32(插槽数)

5/32为0.15625,所以订单的机会¹有15.625%两套结构






完全不奇怪,这正是零比雷埃夫斯测量的结果。 p>




¹技术上这个并不明显。我们可以假设5个哈希中的每一个都是因为重新哈希,但是由于线性探测,它实际上更可能发生聚束结构......但是因为我们只关注是否占用一个时隙,所以它不会实际上并不影响我们。

Given Zero Piraeus' answer to another question, we have that

x = tuple(set([1, "a", "b", "c", "z", "f"]))
y = tuple(set(["a", "b", "c", "z", "f", 1]))
print(x == y)

Prints True about 85% of the time with hash randomization enabled. Why 85%?

解决方案

I'm going to assume any readers of this question to have read both:

The first thing to note is that hash randomization is decided on interpreter start-up.

The hash of each letter will be the same for both sets, so the only thing that can matter is if there is a collision (where order will be affected).


By the deductions of that second link we know the backing array for these sets starts at length 8:

_ _ _ _ _ _ _ _

In the first case, we insert 1:

_ 1 _ _ _ _ _ _

and then insert the rest:

α 1 ? ? ? ? ? ?

Then it is rehashed to size 32:

    1 can't collide with α as α is an even hash
  ↓ so 1 is inserted at slot 1 first
? 1 ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?

In the second case, we insert the rest:

? β ? ? ? ? ? ?

And then try to insert 1:

    Try to insert 1 here, but will
  ↓ be rehashed if β exists
? β ? ? ? ? ? ?

And then it will be rehashed:

    Try to insert 1 here, but will
    be rehashed if β exists and has
  ↓ not rehashed somewhere else
? β ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?

So whether the iteration orders are different depends solely on whether β exists.


The chance of a β is the chance that any of the 5 letters will hash to 1 modulo 8 and hash to 1 modulo 32.

Since anything that hashes to 1 modulo 32 also hashes to 1 modulo 8, we want to find the chance that of the 32 slots, one of the five is in slot 1:

5 (number of letters) / 32 (number of slots)

5/32 is 0.15625, so there is a 15.625% chance¹ of the orders being different between the two set constructions.


Not very strangely at all, this is exactly what Zero Piraeus measured.


¹Technically even this isn't obvious. We can pretend every one of the 5 hashes uniquely because of rehashing, but because of linear probing it's actually more likely for "bunched" structures to occur... but because we're only looking at whether a single slot is occupied, this doesn't actually affect us.

这篇关于为什么元组([1,“a”,“b”,“c”,“z”,“f”)))== tuple(set([“a”,“ b“,”c“,”z“,”f“,1]))启用散列随机化的时间的85%?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
相关文章
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆