为什么元组（[1，“a”，“b”，“c”，“z”，“f”）））== tuple（set（[“a”，“ b“，”c“，”z“，”f“，1]））启用散列随机化的时间的85％？ [英] Why does tuple(set([1,"a","b","c","z","f"])) == tuple(set(["a","b","c","z","f",1])) 85% of the time with hash randomization enabled?

查看：117 发布时间：2018/6/1 18:23:14 python hash set cpython python-internals

本文介绍了为什么元组（[1，“a”，“b”，“c”，“z”，“f”）））== tuple（set（[“a”，“ b“，”c“，”z“，”f“，1]））启用散列随机化的时间的85％？的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

鉴于零比雷埃夫斯对另一个问题的回答，我们有这个

  x =元组（set（[1，a，b，c，z，f]））
y =元组（set（[a，b，c，z，f，1]））
 print（x == y）

用真实大约85％的时间.me / blog / archives / 2012/01/17 / use-random-hashing-if-you-care-about-security /rel =nofollow noreferrer> hash randomization enabled。为什么是85％？

解决方案

我会假设这个问题的任何读者都读过这两个：

Zero Piraeus的答案和

我对CPython词典的解释

首先要注意的是散列随机化决定于解释器启动。

每个字母的散列值对于两个集合都是相同的，所以唯一可能影响的是如果发生碰撞（其中order会受到影响）。

通过扣除第二个链接，我们知道这些集合的支持数组的长度从8开始：

  _ _ _ _ _ _ _ _

在第一种情况下，我们插入 1 ：

< pre class =lang-none prettyprint-override>

 _ 1 _ _ _ _ _ _

，然后插入其余部分：

 α1？ ？ ？ ？ ？ ？

然后将它重新设置为32号：

< pre-class =lang-none prettyprint-override> 1不能与α发生碰撞，因为α是一个偶数哈希 ↓所以1在插槽1中首先插入？ 1？？？？？？？？？？？？？？？？？？？？？？？？？？？？？？
在第二种情况下，我们插入其余的内容：
？ β？？？？？？
然后尝试插入1：
尝试在这里插入1，但是如果β存在， ↓会被重新映射。 β？？？？？？
然后它会被重新设定：

尝试在这里插入1，但是如果β存在并且 ↓没有在其他地方重新映射，则会重新映射？ β？？？？？？？？？？？？？？？？？？？？？？？？？？？？？？
所以迭代次序是否不同只取决于β是否存在。

β的机会是5个字母中的任何一个将散列为1模8 和散列为1的概率模32，因为任何哈希到1模32也哈希到1模8，我们想要找到32个插槽中的一个，其中的一个是在插槽1中：
$ b
5（字母数）/ 32（插槽数）
5/32为0.15625，所以订单的机会¹有15.625％两套结构。

完全不奇怪，这正是零比雷埃夫斯测量的结果。 p>

_{¹技术上这个并不明显。我们可以假设5个哈希中的每一个都是因为重新哈希，但是由于线性探测，它实际上更可能发生聚束结构......但是因为我们只关注是否占用一个时隙，所以它不会实际上并不影响我们。}
Given Zero Piraeus' answer to another question, we have that
x = tuple(set([1, "a", "b", "c", "z", "f"])) y = tuple(set(["a", "b", "c", "z", "f", 1])) print(x == y)
Prints True about 85% of the time with hash randomization enabled. Why 85%?
解决方案
I'm going to assume any readers of this question to have read both:

Zero Piraeus' answer and

My explanation of CPython's dictionaries.

The first thing to note is that hash randomization is decided on interpreter start-up.

The hash of each letter will be the same for both sets, so the only thing that can matter is if there is a collision (where order will be affected).

By the deductions of that second link we know the backing array for these sets starts at length 8:
_ _ _ _ _ _ _ _
In the first case, we insert 1:
_ 1 _ _ _ _ _ _
and then insert the rest:
α 1 ? ? ? ? ? ?
Then it is rehashed to size 32:
1 can't collide with α as α is an even hash ↓ so 1 is inserted at slot 1 first ? 1 ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?
In the second case, we insert the rest:
? β ? ? ? ? ? ?
And then try to insert 1:
Try to insert 1 here, but will ↓ be rehashed if β exists ? β ? ? ? ? ? ?
And then it will be rehashed:
Try to insert 1 here, but will be rehashed if β exists and has ↓ not rehashed somewhere else ? β ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?
So whether the iteration orders are different depends solely on whether β exists.

The chance of a β is the chance that any of the 5 letters will hash to 1 modulo 8 and hash to 1 modulo 32.

Since anything that hashes to 1 modulo 32 also hashes to 1 modulo 8, we want to find the chance that of the 32 slots, one of the five is in slot 1:
5 (number of letters) / 32 (number of slots)
5/32 is 0.15625, so there is a 15.625% chance¹ of the orders being different between the two set constructions.

Not very strangely at all, this is exactly what Zero Piraeus measured.

_{¹Technically even this isn't obvious. We can pretend every one of the 5 hashes uniquely because of rehashing, but because of linear probing it's actually more likely for "bunched" structures to occur... but because we're only looking at whether a single slot is occupied, this doesn't actually affect us.}

这篇关于为什么元组（[1，“a”，“b”，“c”，“z”，“f”）））== tuple（set（[“a”，“ b“，”c“，”z“，”f“，1]））启用散列随机化的时间的85％？的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

问题描述

相关文章

Python最新文章

热门教程

热门工具

登录关闭

问题描述

相关文章

Python最新文章

热门教程

热门工具

登录 关闭

登录关闭