Is Python's random.randint statistically random?

Question

So I'm testing and calculating the probabilities of certain dice rolls for a game. The base case is rolling one 10-sided die.

I did a million samples of this, and ended up with the following proportions:

Result  Proportion
0       0.000000000000000%
1       10.038789961210000%
2       10.043589956410000%
3       9.994890005110000%
4       10.025289974710000%
5       9.948090051909950%
6       9.965590034409970%
7       9.990190009809990%
8       9.985490014509990%
9       9.980390019609980%
10      10.027589972410000%

These should, of course, all be 10%. There is a standard deviation of 0.0323207% in these results. That, to me, seems rather high. Is it just coincidence? As I understand it, the random module provides proper pseudo-random numbers, i.e. ones from a method that passes the statistical tests for randomness. Or are these pseudo-pseudo-random number generators?

Should I be using cryptographic pseudo-random number generators? I'm fairly sure I don't need a true random number generator (see http://www.random.org/, http://en.wikipedia.org/wiki/Hardware_random_number_generator).

I am currently regenerating all my results with 1 billion samples (because why not: I have a crunchy server at my disposal, and some sleep to do).

Recommended answer

Martijn's answer is a pretty succinct review of the random number generators that Python has access to.

If you want to check the properties of the generated pseudo-random data, download random.zip from http://www.fourmilab.ch/random/ and run the ent program it contains on a big sample of random data. The χ² (chi-squared) test in particular is very sensitive to departures from randomness. For a sequence to be considered really random, the percentage from the χ² test should be between 10% and 90%.

For a game I'd guess that the Mersenne Twister that Python uses internally should be sufficiently random (unless you're building an online casino :-).
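
As a sanity check on the spread reported in the question, the sketch below compares it with what a fair 10-sided die would be expected to show. It assumes independent rolls, so the proportion for each face has a standard deviation of sqrt(p(1-p)/n). The helper name `expected_sd_percent` is made up for illustration, and the observed percentages are copied from the question's table.

```python
import math
import statistics

def expected_sd_percent(n_rolls, p):
    """Standard deviation of an observed proportion, in percentage points,
    for n_rolls independent trials that each succeed with probability p."""
    return 100.0 * math.sqrt(p * (1.0 - p) / n_rolls)

# Percentages for faces 1..10, copied from the table in the question.
observed = [
    10.038789961210000, 10.043589956410000, 9.994890005110000,
    10.025289974710000, 9.948090051909950, 9.965590034409970,
    9.990190009809990, 9.985490014509990, 9.980390019609980,
    10.027589972410000,
]

expected = expected_sd_percent(1_000_000, 0.1)   # one million rolls, p = 1/10
measured = statistics.stdev(observed)            # sample standard deviation

print(f"expected about {expected:.4f}%, measured {measured:.4f}%")
```

Under these assumptions the expected figure is about 0.03%, so a measured spread of roughly 0.032% is in line with a fair die rather than a sign of a weak generator.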

If you want pure randomness, and if you are using Linux, you can read from /dev/random. This only produces random data from the kernel's entropy pool (which is gathered from the unpredictable times that interrupts arrive), so it will block if you exhaust it. This entropy is used to initialize (seed) the PRNG used by /dev/urandom. On FreeBSD, the PRNG that supplies data for /dev/random uses the Yarrow algorithm, which is generally regarded as being cryptographically secure.
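
From Python, the operating system's generator can be reached without opening a device file by hand. A minimal sketch, assuming the standard library's `random.SystemRandom` (which draws from `os.urandom`) is acceptable for the dice rolls; `roll_d10` is a hypothetical helper, not something from the question:

```python
import random

# random.SystemRandom draws from the operating system's randomness source
# (os.urandom) instead of the Mersenne Twister, so it cannot be seeded.
system_rng = random.SystemRandom()

def roll_d10(rng=system_rng):
    """Return one roll of a 10-sided die, 1..10 inclusive."""
    return rng.randint(1, 10)

rolls = [roll_d10() for _ in range(10)]
print(rolls)
```

This is typically slower than the default generator, which is a reasonable trade-off only if unpredictability matters more than speed or reproducibility.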

Edit: I ran some tests on bytes from random.randint. First, I created a million random bytes:

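import random

# One million bytes, each drawn uniformly from 0..255 (randint is inclusive)
ba = bytearray(random.randint(0, 255) for _ in range(1000000))
# Binary mode so the bytes are written unmodified
with open('randint.dat', 'wb') as f:
    f.write(ba)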

Then I ran the ent program from Fourmilab on it:

Entropy = 7.999840 bits per byte.

Optimum compression would reduce the size
of this 1000000 byte file by 0 percent.

Chi square distribution for 1000000 samples is 221.87, and randomly
would exceed this value 93.40 percent of the times.

Arithmetic mean value of data bytes is 127.5136 (127.5 = random).
Monte Carlo value for Pi is 3.139644559 (error 0.06 percent).
Serial correlation coefficient is -0.000931 (totally uncorrelated = 0.0).
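
The chi-square figure in this output is a Pearson statistic over the 256 possible byte values. A rough sketch of that calculation in Python, assuming a uniform expected count per byte value; `chi_square_uniform` is a name made up here, the sample size is arbitrary, and this is not ent's own code:

```python
import random
from collections import Counter

def chi_square_uniform(counts):
    """Pearson chi-squared statistic for observed counts against a
    uniform expectation over len(counts) categories."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

# Hypothetical sample: 100000 byte values from random.randint.
sample = [random.randint(0, 255) for _ in range(100_000)]
tally = Counter(sample)
byte_counts = [tally.get(value, 0) for value in range(256)]

# With 256 categories there are 255 degrees of freedom, so a value
# somewhere near 255 is what uniformly random bytes tend to give.
print(chi_square_uniform(byte_counts))
```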

Now for the χ² test: the further you get from 50%, the more suspect the data is. If one is very fussy, values <10% or >90% are deemed unacceptable. John Walker, the author of ent, calls a value like the 93.40% above "almost suspect".

As a contrast, here is the same analysis of 10 MiB from FreeBSD's Yarrow PRNG that I ran earlier:

Entropy = 7.999982 bits per byte.

Optimum compression would reduce the size
of this 10485760 byte file by 0 percent.

Chi square distribution for 10485760 samples is 259.03, and randomly
would exceed this value 41.80 percent of the times.

Arithmetic mean value of data bytes is 127.5116 (127.5 = random).
Monte Carlo value for Pi is 3.139877754 (error 0.05 percent).
Serial correlation coefficient is -0.000296 (totally uncorrelated = 0.0).

While there does not seem to be much difference in the other figures, the χ² percentage is much closer to 50%.
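
To see how the two percentages relate to the chi-square values in the ent output, they can be roughly reproduced from the statistic and its 255 degrees of freedom. The sketch below uses the Wilson-Hilferty normal approximation instead of an exact chi-squared distribution function, so it is an estimate and not ent's own calculation; `chi2_exceed_percent` is a hypothetical name.

```python
import math

def chi2_exceed_percent(chi2, dof):
    """Approximate percentage of the time a chi-squared variable with
    `dof` degrees of freedom would exceed `chi2`, using the
    Wilson-Hilferty cube-root normal approximation (not an exact CDF)."""
    spread = 2.0 / (9.0 * dof)
    z = ((chi2 / dof) ** (1.0 / 3.0) - (1.0 - spread)) / math.sqrt(spread)
    # Upper tail of the standard normal distribution.
    return 100.0 * 0.5 * math.erfc(z / math.sqrt(2.0))

# Chi-square values from the two ent runs, 255 degrees of freedom each.
randint_pct = chi2_exceed_percent(221.87, 255)
yarrow_pct = chi2_exceed_percent(259.03, 255)
print(f"{randint_pct:.1f}% {yarrow_pct:.1f}%")
```

Both values should land close to the percentages ent printed, which shows that the percentage is simply the upper-tail probability of the reported statistic.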
