Indeterministic sets in Python 2 and 3


Question

Sets are unordered collections of values. If I construct a set via a set literal, e.g.

s = {'a', 'b', 'c'}

and then print it, I get the elements in some scrambled order. However, it seems that in Python 2.7 the above example always results in the same ordering:

print(s)  # set(['a', 'c', 'b']) in Python 2.7

How does Python 2.7 decide on this ordering? Even the hashes of 'a', 'b' and 'c' are not in the order produced.

In Python 3.x (including 3.6, where dict keys are ordered) the resulting order seems to be random, though always the same within a given Python process. That is, repeatedly rebuilding the set literal always leads to the same ordering, as long as I do not restart the Python interpreter.
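The within-process stability described above can be checked with a minimal sketch like the following. The variable name `orders_seen` is purely illustrative; the snippet only assumes that it runs inside a single interpreter process.

```python
# Rebuild the same set literal many times inside one interpreter process
# and record every iteration order that shows up.
orders_seen = {tuple({'a', 'b', 'c'}) for _ in range(1000)}

# String hashes stay fixed for the lifetime of the process, so the same
# insertions always produce the same iteration order.
print(len(orders_seen))  # 1
print(orders_seen)       # the single order seen; it may differ on the next run
```

Which order is printed depends on the process, but only one order should ever appear per run.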

To check the ordering across multiple Python processes, consider the bash code

(for _ in {1..50}; do python3 -c "s = {'a', 'b', 'c'}; print(s)"; done) | sort -u

This will (most often) show all 6 possible orderings of the 3 elements. Replacing python3 with python (Python 2), we only see the ordering ['a', 'c', 'b']. What determines the ordering in Python 3?

I see that the hash values of objects are deterministic in Python 2 but random (though constant within a Python process) in Python 3. I'm sure this is key to the full explanation.
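One way to observe this from Python itself is sketched below: it starts a few fresh interpreters and collects `hash('a')` from each. The helper name `child_hashes` and the run count are illustrative assumptions. The sketch also assumes `sys.executable` points at a Python 3.3+ interpreter, and it drops any inherited hash-seed setting so that the interpreter's default behaviour applies.

```python
import os
import subprocess
import sys

def child_hashes(runs=5):
    """Return hash('a') as reported by `runs` freshly started interpreters."""
    # Drop any inherited seed setting so each child uses its default behaviour.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONHASHSEED"}
    results = []
    for _ in range(runs):
        out = subprocess.run(
            [sys.executable, "-c", "print(hash('a'))"],
            env=env, capture_output=True, text=True, check=True,
        ).stdout
        results.append(int(out))
    return results

hashes_across_processes = child_hashes()
print(hashes_across_processes)   # typically five different integers
print(hash('a') == hash('a'))    # True: constant within this process
```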

As deceze writes in his comment, I would like to know whether Python explicitly does something just to achieve this randomization, or whether it happens "for free".

Recommended answer

The reason for the difference in Python 3 (from Python 3.3 onwards) is that hash randomization is enabled by default. You can make the ordering reproducible by setting the PYTHONHASHSEED environment variable to a fixed value (0 disables randomization entirely):

$ export PYTHONHASHSEED=0
$ (for _ in {1..50}; do python3  -c "s = {'a', 'b', 'c'}; print(s)"; done) | sort -u
{'a', 'b', 'c'}
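The same experiment can be sketched from Python by passing PYTHONHASHSEED to child interpreters through their environment. The helper name `orderings`, the seed 12345, and the run count are illustrative assumptions; `sys.executable` is assumed to be a Python 3 interpreter.

```python
import os
import subprocess
import sys

SNIPPET = "print(list({'a', 'b', 'c'}))"

def orderings(seed, runs=10):
    """Collect the distinct set orderings printed by `runs` fresh interpreters."""
    # PYTHONHASHSEED accepts "random" or a decimal integer; 0 disables randomization.
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    seen = set()
    for _ in range(runs):
        out = subprocess.run(
            [sys.executable, "-c", SNIPPET],
            env=env, capture_output=True, text=True, check=True,
        ).stdout.strip()
        seen.add(out)
    return seen

fixed_zero = orderings(0)       # randomization disabled
fixed_other = orderings(12345)  # randomization on, but with a reproducible seed
print(fixed_zero)
print(fixed_other)
```

Each call should yield exactly one ordering, although the two seeds need not agree with each other.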

Equally, you can turn hash randomization on in Python 2 with the -R flag:

$ (for _ in {1..50}; do python2 -R -c "s = {'a', 'b', 'c'}; print(s)"; done) | sort -u
set(['a', 'b', 'c'])
set(['a', 'c', 'b'])
set(['b', 'c', 'a'])
set(['c', 'b', 'a'])

Note that you generally don't want to turn it off, since hash randomization helps protect against certain denial-of-service attacks.
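If the goal is simply reproducible output, one option that leaves hash randomization enabled is to sort the elements when displaying them. This is a suggestion added here rather than part of the original answer; a minimal sketch:

```python
s = {'a', 'b', 'c'}

# sorted() returns a new list, ordered by the elements' own comparison,
# so the result does not depend on the hash seed.
stable = sorted(s)
print(stable)  # ['a', 'b', 'c']
```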
