Indeterministic sets in Python 2 and 3
Question
Sets are collections of unordered values. If I construct a set via a set literal, e.g.
s = {'a', 'b', 'c'}
and then print it, I get the elements in some scrambled order. However, it seems that in Python 2.7, the above example always results in the same ordering:
print(s) # set(['a', 'c', 'b']) in Python 2.7
How does Python 2.7 decide on this ordering? Even the hashes of 'a', 'b' and 'c' are not in the order produced.
In Python 3.x (including 3.6, where dict keys are ordered) the resulting order seems to be random, though always the same within a given Python process. That is, repeatedly rebuilding the set literal always leads to the same ordering, as long as I do not restart the Python interpreter.
To check the ordering across multiple Python processes, consider the bash code
(for _ in {1..50}; do python3 -c "s = {'a', 'b', 'c'}; print(s)"; done) | sort -u
This will (most often) show the 6 different ways the 3 elements can be arranged. Switching out python3 with python (2), we only see the ordering ['a', 'c', 'b']. What determines the ordering in Python 3?
I see that the hash values of objects are deterministic in Python 2 but random (though constant within a Python process) in Python 3. I'm sure this is key to the full explanation.
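The observation about hash values can be checked from Python itself. The following is a minimal sketch, assuming Python 3.7 or later and that spawning child interpreters is allowed; the helper name child_hash is made up for illustration. Note that the randomization applies to str and bytes objects; small integers, for example, hash to the same value in every process.

```python
import os
import subprocess
import sys


def child_hash():
    """Return hash('a') as computed by a fresh interpreter (hypothetical helper)."""
    env = dict(os.environ)
    # Make sure an inherited seed does not pin the hash in the child.
    env.pop("PYTHONHASHSEED", None)
    result = subprocess.run(
        [sys.executable, "-c", "print(hash('a'))"],
        env=env, capture_output=True, text=True, check=True,
    )
    return int(result.stdout)


# Within one process, the hash of a given string never changes.
same_process = {hash('a') for _ in range(5)}

# Across processes, each interpreter picks its own random seed.
across_processes = {child_hash() for _ in range(5)}

print(len(same_process))      # one distinct value
print(len(across_processes))  # almost certainly more than one
```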
As deceze writes in his comment, I would like to know whether Python explicitly does something just to achieve this randomization, or whether it happens "for free".
Recommended answer
The reason for the difference in Python 3 (from Python 3.3 onwards) is that hash randomization is enabled by default. You can turn it off by setting the PYTHONHASHSEED environment variable to a fixed value (0 disables randomization entirely; any other fixed integer gives a reproducible seed):
$ export PYTHONHASHSEED=0
$ (for _ in {1..50}; do python3 -c "s = {'a', 'b', 'c'}; print(s)"; done) | sort -u
{'a', 'b', 'c'}
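The same experiment can be run without bash. This is a rough sketch, assuming Python 3.7 or later; the helper name set_order and the run counts are illustrative choices, not part of the original answer. It starts fresh interpreters with and without a fixed PYTHONHASHSEED and collects the distinct orderings seen.

```python
import os
import subprocess
import sys

CODE = "print(list({'a', 'b', 'c'}))"


def set_order(seed=None):
    """Return the printed set order from a fresh interpreter (hypothetical helper)."""
    env = dict(os.environ)
    env.pop("PYTHONHASHSEED", None)
    if seed is not None:
        # A fixed seed makes str hashes, and therefore the order, reproducible.
        env["PYTHONHASHSEED"] = str(seed)
    result = subprocess.run(
        [sys.executable, "-c", CODE],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


fixed_orders = {set_order(seed=0) for _ in range(5)}
random_orders = {set_order() for _ in range(20)}

print(fixed_orders)   # a single ordering
print(random_orders)  # usually several of the 6 possible orderings
```

Which ordering a particular seed produces may differ between Python versions, so the sketch only counts distinct orderings rather than expecting a specific one.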
Equally, you can turn hash randomization on in Python 2 with the -R flag:
$ (for _ in {1..50}; do python2 -R -c "s = {'a', 'b', 'c'}; print(s)"; done) | sort -u
set(['a', 'b', 'c'])
set(['a', 'c', 'b'])
set(['b', 'c', 'a'])
set(['c', 'b', 'a'])
Note that you don't generally want to turn it off, since having hash randomization enabled helps protect against certain denial-of-service attacks.
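As for whether the scrambling happens "for free": in CPython, iterating over a set walks the underlying hash table slot by slot, so the order follows from the hashes rather than from any extra shuffling step. The sketch below illustrates this under stated assumptions: it relies on CPython implementation details (a small set uses a table of 8 slots, an element without collisions lands in slot hash & 7, and small non-negative integers hash to themselves), none of which is guaranteed by the language. Integers are used so that the result does not depend on the hash seed.

```python
# Assumption: CPython, where a set with a few elements has 8 slots.
MASK = 7  # table size minus one

values = [17, 2, 8]
s = set(values)

# Slot each value would occupy if there are no collisions.
slots = {v: hash(v) & MASK for v in values}

# Iteration visits slots in increasing index order.
predicted = sorted(values, key=slots.get)

print(slots)      # {17: 1, 2: 2, 8: 0}
print(predicted)  # [8, 17, 2]
print(list(s))    # matches the prediction on CPython
```

For strings the same mechanism applies, but since their hashes change with the seed, the slots (and hence the order) change from one process to the next.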