为什么元组([1,“a”,“b”,“c”,“z”,“f”)))== tuple(set([“a”,“ b“,”c“,”z“,”f“,1]))启用散列随机化的时间的85%? [英] Why does tuple(set([1,"a","b","c","z","f"])) == tuple(set(["a","b","c","z","f",1])) 85% of the time with hash randomization enabled?
问题描述
鉴于零比雷埃夫斯对另一个问题的回答,我们有这个
x =元组(set([1,a,b,c,z,f]))
y =元组(set([a,b,c,z,f,1]))
print(x == y)
用真实大约85%的时间.me / blog / archives / 2012/01/17 / use-random-hashing-if-you-care-about-security /rel =nofollow noreferrer> hash randomization enabled。为什么是85%?
我会假设这个问题的任何读者都读过这两个:
首先要注意的是散列随机化决定于解释器启动。
每个字母的散列值对于两个集合都是相同的,所以唯一可能影响的是如果发生碰撞(其中order会受到影响)。
通过扣除第二个链接,我们知道这些集合的支持数组的长度从8开始:
_ _ _ _ _ _ _ _
在第一种情况下,我们插入 1
:
< pre class =lang-none prettyprint-override>
_ 1 _ _ _ _ _ _
,然后插入其余部分:
α1? ? ? ? ? ?
然后将它重新设置为32号:
< pre-class =lang-none prettyprint-override>
1不能与α发生碰撞,因为α是一个偶数哈希
↓所以1在插槽1中首先插入
? 1? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?
在第二种情况下,我们插入其余的内容:
? β? ? ? ? ? ?
然后尝试插入1:
尝试在这里插入1,但是如果β存在
,
↓会被重新映射。 β? ? ? ? ? ?
然后它会被重新设定:
尝试在这里插入1,但是如果β存在并且
↓没有在其他地方重新映射
,则会重新映射
? β? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?
所以迭代次序是否不同只取决于β是否存在。
β的机会是5个字母中的任何一个将散列为1模8 和散列为1的概率模32,因为任何哈希到1模32也哈希到1模8,我们想要找到32个插槽中的一个,其中的一个是在插槽1中:
$ b
5(字母数)/ 32(插槽数)
5/32为0.15625,所以订单的机会¹有15.625%两套结构。
完全不奇怪,这正是零比雷埃夫斯测量的结果。 p>
¹技术上这个并不明显。我们可以假设5个哈希中的每一个都是因为重新哈希,但是由于线性探测,它实际上更可能发生聚束结构......但是因为我们只关注是否占用一个时隙,所以它不会实际上并不影响我们。 Given Zero Piraeus' answer to another question, we have that Prints I'm going to assume any readers of this question to have read both: The first thing to note is that hash randomization is decided on interpreter start-up. The hash of each letter will be the same for both sets, so the only thing that can matter is if there is a collision (where order will be affected). By the deductions of that second link we know the backing array for these sets starts at length 8: In the first case, we insert and then insert the rest: Then it is rehashed to size 32: In the second case, we insert the rest: And then try to insert 1: And then it will be rehashed: So whether the iteration orders are different depends solely on whether β exists. The chance of a β is the chance that any of the 5 letters will hash to 1 modulo 8 and hash to 1 modulo 32. Since anything that hashes to 1 modulo 32 also hashes to 1 modulo 8, we want to find the chance that of the 32 slots, one of the five is in slot 1: 5/32 is 0.15625, so there is a 15.625% chance¹ of the orders being different between the two set constructions. Not very strangely at all, this is exactly what Zero Piraeus measured. ¹Technically even this isn't obvious. We can pretend every one of the 5 hashes uniquely because of rehashing, but because of linear probing it's actually more likely for "bunched" structures to occur... but because we're only looking at whether a single slot is occupied, this doesn't actually affect us. 这篇关于为什么元组([1,“a”,“b”,“c”,“z”,“f”)))== tuple(set([“a”,“ b“,”c“,”z“,”f“,1]))启用散列随机化的时间的85%?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!x = tuple(set([1, "a", "b", "c", "z", "f"]))
y = tuple(set(["a", "b", "c", "z", "f", 1]))
print(x == y)
True
about 85% of the time with hash randomization enabled. Why 85%?
_ _ _ _ _ _ _ _
1
:_ 1 _ _ _ _ _ _
α 1 ? ? ? ? ? ?
1 can't collide with α as α is an even hash
↓ so 1 is inserted at slot 1 first
? 1 ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?
? β ? ? ? ? ? ?
Try to insert 1 here, but will
↓ be rehashed if β exists
? β ? ? ? ? ? ?
Try to insert 1 here, but will
be rehashed if β exists and has
↓ not rehashed somewhere else
? β ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?
5 (number of letters) / 32 (number of slots)