Set.pop() isn't random?

Question

The Python docs say that set.pop() will "remove and return an arbitrary element from s". While generating some random data to test a program, I noticed strange behavior from this pop() function. Here is my code (Python 2.7.3):

import random
from sets import Set  # deprecated since Python 2.6; the built-in set() also works here

testCases = 10
numberRange = 500

poppedValues = []
greaterPercentages = []

for i in range(testCases):
    s = Set()

    # insert 100 random values into the set, each in the range [0, numberRange)
    for j in range(100):
        s.add(random.randrange(numberRange))

    poppedValue = s.pop()
    greaterCount = 0

    # count how many numbers in the set are smaller than the popped value
    for number in s:
        if poppedValue > number:
            greaterCount += 1

    poppedValues.append(poppedValue)
    greaterPercentages.append(float(greaterCount) / len(s) * 100)

# first row: the popped values
for poppedValue in poppedValues:
    print poppedValue, '\t',

print

# second row: percentage of remaining elements smaller than each popped value
for percentage in greaterPercentages:
    print "{:2.2f}".format(percentage), '\t',

What I am doing here is:

  1. Inserting some random values into the set s, each in the range [0, numberRange)
  2. Popping an element from the set (according to the docs, it should be a random one)
  3. Counting how many elements in the set are smaller than the popped value

I expected the popped value to be random, so that about 50% of the numbers in the set would be greater than it. But pop() seems to almost always return the lowest number in the set. Here are the results for numberRange = 500. The first row shows the popped values; the second row shows the percentage of elements smaller than the popped value.

9   0   3   1   409     0   1   2   4   0   
0 % 0 % 0 % 0 % 87 %    0 % 0 % 0 % 0 % 0 %

I've conducted this test with different values of numberRange. It seems that for lower values of the set elements, pop() almost always returns the lowest element. But for higher values it returns a random element. For numberRange = 1000, the result is:

518     3586    3594    4103    2560    3087    4095    3079    3076    1622    
7 %     72 %    73 %    84 %    54 %    51 %    79 %    63 %    67 %    32 %

That looks pretty random to me. Why this strange behavior? Am I doing something wrong?

EDIT: Thanks for everyone's answers and comments. It seems that "arbitrary" does not guarantee "random".
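
For the test-data use case, where a uniformly random element really is wanted, one possible workaround is sketched below. The helper name pop_random and its rng parameter are made up for this illustration and are not part of the set API; the sketch is written for Python 3 and copies the set into a tuple on every call, so it costs O(n) per pop.

```python
import random


def pop_random(s, rng=random):
    """Remove and return a uniformly random element of the set s.

    Hypothetical helper for illustration; not a built-in set method.
    """
    if not s:
        raise KeyError("pop from an empty set")
    # random.choice needs a sequence, so copy the set into a tuple first.
    element = rng.choice(tuple(s))
    s.remove(element)
    return element


rng = random.Random(0)      # seeded generator so a run can be repeated
s = set(range(100))
picked = pop_random(s, rng)
print(picked in s)          # False: the element has been removed
print(len(s))               # 99
```

If many random pops are needed, shuffling a list once and popping from its end is probably cheaper than rebuilding the tuple each time.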

Recommended answer

It's an implementation detail: in CPython, set is implemented as a hash table (similar to dict, but without a slot for a value), set.pop removes the first entry it finds in that table, and the hash value of a small non-negative int is the int itself.
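
The claim about int hashes is easy to check in an interpreter. The snippet below is only a quick sanity check on CPython; the sample values are arbitrary and other Python implementations may hash differently.

```python
# In CPython, small non-negative ints hash to themselves.
small_ints = [0, 1, 2, 42, 499]
hashes = [hash(n) for n in small_ints]
print(hashes)       # [0, 1, 2, 42, 499]

# One well-known exception among small ints: -1 is reserved as an
# error code inside CPython, so hash(-1) is -2.
print(hash(-1))     # -2
```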

Combined, this means that your set, which is ordered by hash value modulo the hash table size, ends up ordered by the entries themselves modulo the table size. In your case this is close to natural ordering, because you are only inserting numbers from a small range; if you draw random numbers from randrange(10**10) instead of randrange(500), you should see different behaviour. Also, depending on insertion order, hash collisions can push some values out of their original hashing order.
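
To make the "modulo table size" point concrete, here is a deliberately simplified toy model, written for Python 3. It is not CPython's actual set implementation: the function name first_occupied_slot is invented, the table size is fixed instead of growing, and collisions are resolved by plain linear probing, which is an assumption made for brevity. It only illustrates why the first occupied slot tends to hold the smallest value when the table is larger than the value range.

```python
def first_occupied_slot(values, table_size):
    """Toy hash table: put each int in slot hash(v) % table_size, move to
    the next free slot on a collision, and return the value stored in the
    first occupied slot (a stand-in for what pop() would hand back)."""
    slots = [None] * table_size
    for v in values:
        if v in slots:
            continue                      # sets keep one copy of each value
        if None not in slots:
            raise ValueError("toy table is full")
        i = hash(v) % table_size
        while slots[i] is not None:       # collision: try the next slot
            i = (i + 1) % table_size
        slots[i] = v
    return next((v for v in slots if v is not None), None)


# Table larger than every value: slot order equals natural order.
print(first_occupied_slot([130, 5, 409, 300], 512))   # 5

# Smaller table: 130 % 128 == 2, which comes before slot 5.
print(first_occupied_slot([130, 5, 409, 300], 128))   # 130

# Collision: 0 and 128 both want slot 0, so insertion order decides.
print(first_occupied_slot([128, 0], 128))             # 128
print(first_occupied_slot([0, 128], 128))             # 0
```

The second call shows one way a value that is not the minimum can land in the first occupied slot, and the last two calls show the insertion-order effect caused by collisions.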
