Random number in the range 1 to sys.maxsize is always 1 mod 2^10


Problem Description

I am trying to characterize the statistical properties of the PRNGs available in Python (2.7.10) using the frequency test, the runs test, and the chi-squared test.

To carry out the frequency test, I need to convert each generated random number to its binary representation and then count the distribution of 1s and 0s. While experimenting with the binary representations in the Python console, I observed this odd behavior:

>>> import random, sys
>>> for n in random.sample(xrange(1, sys.maxsize), 50):
...     print '{0:b}'.format(n)
...
101101110011011001110011110110101101101101111111101000000000001
110000101001001011101001110111111110011000101011100010000000001
110111101101110011100010001010000101011111110010001110000000001
100001111010011000101001000001000011001111100000001010000000001
1111000010010011111100111110110100100011110111010000000000001
111000001011101011101110100001001001000011011001110110000000001
1000100111011000111000101010000101010100110111000100000000001
11101001000001101111110101111011001000100011011011010000000001
110011010111101101011000110011011001110001111000001010000000001
110110110110111100011111110111011111101000011001100000000001
100010010000011101011100110101011110111100001100100000000000001
10111100011010011010001000101011001110010010000010010000000001
101011100110110001010110000101100000111111011101011000000000001
1111110010110010000111111000010001101011011010101110000000001
11100010101101110110101000101101011011111101101000010000000001
10011110110110010110011010000110010010111001111001010000000001
110110011100111010100111100100000100011101100001100000000000001
100110011001101011110011010101111101100010000111001010000000001
111000101101100111110010110110100110111001000101000000000000001
111111101000010111001011111100111100011101001011010000000001
11110001111100000111010010011111010101101110111001010000000001
100001100101101100010101111100111101111001101010101010000000001
11101010110011000001101110000000001111010001110111000000000001
100111000110111010001110110101001011100101111101010000000001
100001101100000011101101010101111111011010111110111110000000001
100010010011110110111111111000010001101100111001001100000000001
110011111110010011000110101010101001001010000100011010000000001
1111011010100001001101101000011100001011001110010100000000001
110110011101100101001100111010101111001011111101100000000000001
1010001110100101001001011111000111011100001100000110000000001
1000101110010011011000001011010110001000110100100100000000001
11111110011001011100111110110111000001000100100010000000000001
101111101010000101010111111111000001100101111001011110000000001
10010010111111111100000001010010101100111001100000000000001
111110000001110010001110111101110101010110001110000000000000001
100000101101000110101010010000101101000011111010001110000000001
101001011101100011001000011010010000000111110111100010000000001
10110101010000111010110111001111011000001111001100110000000001
10110111100100100011100101001100000000101110100100010000000001
10010111110001011101001110000111011010110100110111110000000001
111011110010110111011011101011001100001000111001010100000000001
101001010001010100010010010001100111101110101111000110000000001
101011111010000101010101000110001101001001011110000000000001
1010001010111101101010111110110110000001111101101110000000001
10111111111010001000110000101101010101011010101100000000001
101011101010110000001111010100100110000011111100100100000000001
111100001101111010100111010001010010000010110110010110000000001
100111111000100110100001110101000010111111010010010000000000001
100111100001011100011000000000101100111111000111100110000000001
110110100000110111011101110101101000101110111111010110000000001
>>> 

As you can see, all numbers end in 0000000001, i.e., all numbers are 1 mod 2^10. Why is this so?
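The observation can be checked numerically rather than by eye. The following is a minimal sketch: it copies three of the binary strings printed above and extracts their ten lowest-order bits. The helper name low_bits is hypothetical and introduced only for this illustration.

```python
# Three of the values printed in the session above, copied as binary strings.
samples = [
    "1111000010010011111100111110110100100011110111010000000000001",
    "10010010111111111100000001010010101100111001100000000000001",
    "10111111111010001000110000101101010101011010101100000000001",
]


def low_bits(value, k=10):
    """Return the k lowest-order bits of value as an integer (value mod 2**k)."""
    return value & ((1 << k) - 1)


# Convert each string from base 2 and keep only the residue mod 2**10.
residues = [low_bits(int(s, 2)) for s in samples]
print(residues)
```

If the pattern described above holds, every residue is 1.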

Also, this behavior appears only when the range is 1 to sys.maxsize. If the range is 1 to 2^40, it does not occur. I would like to know the reason for this behavior and whether there is anything wrong with my code.

The documentation for the random library that implements the PRNGs I am using is here.

Let me know if I should provide any more information.

Recommended Answer

@roeland hinted at the cause: in Python 2, sample() repeatedly uses int(random.random() * n) to pick an index. Look at the source code (in your Python's Lib/random.py) for full details. In short, random.random() returns a float with no more than 53 significant leading bits. You're evidently on a machine where sys.maxsize == 2**63 - 1, so n is just under 2**63; the product carries only those 53 bits, and int() fills the remaining low-order bits (at least 10 of them) with zeroes. Indexing your base (xrange(1, sys.maxsize)) by an even integer j with "a lot" of low-order 0 bits returns 1 + j, which is always an odd integer with the same low-order 0 bits (except for the last bit, which is 1).
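The arithmetic can be reproduced directly. This is a minimal sketch, not a copy of Lib/random.py: the helper py2_style_index is a hypothetical name, MAXSIZE is hard-coded on the assumption of a 64-bit build, and the only step taken from the answer is the expression int(random() * n). The second range, 2**40, is the one the question reports as unaffected.

```python
import random

MAXSIZE = 2**63 - 1  # assumed value of sys.maxsize on a 64-bit build


def py2_style_index(n, rng=random):
    """Pick an index in [0, n) as int(random() * n), per the answer's description."""
    return int(rng.random() * n)


# Population xrange(1, sys.maxsize) has n = sys.maxsize - 1 elements,
# and its j-th element is 1 + j.
n_big = MAXSIZE - 1
big_indices = [py2_style_index(n_big) for _ in range(1000)]
big_values = [1 + j for j in big_indices]

# Every index is a multiple of 2**10, so every value is 1 mod 2**10.
print(all(j % 1024 == 0 for j in big_indices))
print(all(v % 1024 == 1 for v in big_values))

# With n = 2**40 the product fits well inside 53 bits, so int() truncates
# real fractional bits and the low-order bits of the index vary.
small_indices = [py2_style_index(2**40) for _ in range(1000)]
print(len({j % 1024 for j in small_indices}) > 1)
```

The float nearest to 2**63 - 2 is 2**63 itself, so multiplying a 53-bit fraction by it shifts the fraction left by 63 bits and leaves the bottom 10 bits of the integer result empty.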

In Python 3 none of that happens: random in Python 3 uses stronger algorithms, and only falls back to random.random() when necessary. For example, here under Python 3.4.3:

>>> hex(random.randrange(10**70))
'0x91fc11ed768be3a454bd66f593c218d8bbfa3b99f6285291e1d9f964a9'
>>> hex(random.randrange(10**70))
'0x7b07ff02b6676801e33094fca2fcca7f6e235481c479c521643b1acaf4'

EDIT

Here's a more directly relevant example, under 3.4.3 on a 64-bit box:

>>> import random, sys
>>> sys.maxsize == 2**63 - 1
True
>>> for i in random.sample(range(1, sys.maxsize), 6):
...    print(bin(i))
0b10001100101001001111110110011111000100110100111001100000010110
0b100111100110110100111101001100001100110001110010000101101000101
0b1100000001110000110100111101101010110001100110101111011100111
0b111110100001111100101001001001101101100100011001001010100001110
0b1100110100000011100010000011010010100100110111001111100110100
0b10011010000110101010101110001000101110111100100001111101110111

Python 3 doesn't invoke random.random() at all in this case. Instead, it iteratively grabs chunks of 32 bits from the underlying Mersenne Twister (32-bit unsigned ints are "the natural" outputs from this implementation of MT) and pastes them together to build a suitable index. So, in Python 3, platform floats have nothing to do with it; in Python 2, quirks of float behavior have everything to do with it.
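The idea of building an index from raw generator bits can be sketched with the public random.getrandbits() call. This is only an illustration under stated assumptions: the function name randbelow_from_bits is hypothetical, the rejection loop is one common way to get a uniform result, and the actual private helper in CPython's Lib/random.py may differ between versions.

```python
import random


def randbelow_from_bits(n, rng=random):
    """Return a uniform integer in [0, n) built from raw bits, with no floats involved."""
    k = n.bit_length()          # enough bits to cover every value below n
    r = rng.getrandbits(k)      # integer assembled from the generator's raw output
    while r >= n:               # reject out-of-range draws to keep the result uniform
        r = rng.getrandbits(k)
    return r


n = 2**63 - 2  # size of range(1, sys.maxsize) on a 64-bit build (assumed)
picks = [1 + randbelow_from_bits(n) for _ in range(1000)]

# Unlike the float-based approach, the low-order ten bits now vary.
print(len({p % 1024 for p in picks}) > 1)
```

Because no float multiplication is involved, all 63 bits of each pick come from the generator, which is why the low-order bits no longer show a fixed pattern.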
