Why do people say there is modulo bias when using a random number generator?

Views: 454 · Posted: 2016/10/11 19:49:27 · Tags: c++ random language-agnostic modulo


Question

I have seen this question asked a lot but never seen a true concrete answer to it. So I am going to post one here which will hopefully help people understand why exactly there is "modulo bias" when using a random number generator, like rand() in C++.

Solution

So rand() is a pseudo-random number generator which chooses a natural number between 0 and RAND_MAX, which is a constant defined in cstdlib (see this article for a general overview on rand()).

Now what happens if you want to generate a random number between say 0 and 2? For the sake of explanation, let's say RAND_MAX is 10 and I decide to generate a random number between 0 and 2 by calling rand()%3. However, rand()%3 does not produce the numbers between 0 and 2 with equal probability! 

When rand() returns 0, 3, 6, or 9, rand()%3 == 0. Therefore, P(0) = 4/11

When rand() returns 1, 4, 7, or 10, rand()%3 == 1. Therefore, P(1) = 4/11 

When rand() returns 2, 5, or 8, rand()%3 == 2. Therefore, P(2) = 3/11

So rand()%3 does not generate the numbers between 0 and 2 with equal probability. When n is small relative to RAND_MAX the skew is slight, but as n grows toward RAND_MAX it becomes pronounced, and it always favors the smaller results.
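
The counting argument above can be checked mechanically. The following is a minimal sketch, not part of the original answer: `residue_counts` and its parameter `max_value` (standing in for RAND_MAX) are names made up for illustration. It enumerates every equally likely output and tallies where each one lands after `% n`.

```cpp
#include <cassert>
#include <vector>

// Count how many of the equally likely outputs 0..max_value land on each
// residue when reduced with % n. max_value stands in for RAND_MAX.
std::vector<int> residue_counts(int max_value, int n) {
    assert(max_value >= 0 && n > 0);
    std::vector<int> counts(n, 0);
    for (int r = 0; r <= max_value; ++r) {
        ++counts[r % n];
    }
    return counts;
}
```

With the toy values from the example, `residue_counts(10, 3)` yields the counts 4, 4, 3, which correspond to the probabilities 4/11, 4/11 and 3/11.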

So when does rand()%n return each number from 0 to n-1 with equal probability? When RAND_MAX%n == n - 1, that is, when the RAND_MAX + 1 possible outputs of rand() divide evenly into n groups. In this case, along with our earlier assumption that rand() returns each number between 0 and RAND_MAX with equal probability, the residue classes modulo n are also equally distributed.
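
The condition can be written as a one-line predicate. This is a sketch for illustration only: `modulo_is_uniform` and `max_value` are hypothetical names, and the function assumes the generator is itself uniform over 0..max_value.

```cpp
#include <cassert>

// True when every residue 0..n-1 is hit by the same number of the
// max_value + 1 possible outputs, i.e. when max_value + 1 is a multiple of n.
bool modulo_is_uniform(long long max_value, long long n) {
    assert(max_value >= 0 && n > 0);
    return max_value % n == n - 1;
}
```

For the toy RAND_MAX of 10, `modulo_is_uniform(10, 3)` is false, whereas `modulo_is_uniform(11, 3)` is true because 12 outputs split evenly into three groups of four. Assuming a RAND_MAX of 32767 (the minimum the C standard allows), any power-of-two n up to 32768 passes the check, while n = 3 does not.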

So how do we solve this problem? A crude way is to keep generating random numbers until you get a number in your desired range:
int x;
do {
    x = rand();  // redraw until the value lands in [0, n)
} while (x >= n);

but that's inefficient for low values of n: rand() has RAND_MAX + 1 possible outputs, so each call has only an n/(RAND_MAX + 1) chance of landing in your range, and you'll need about (RAND_MAX + 1)/n calls to rand() on average.
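
To put a number on that cost: each draw is accepted independently with probability n/(RAND_MAX + 1), so the number of draws follows a geometric distribution whose mean is the reciprocal of that probability. The sketch below just evaluates that mean; `expected_calls_crude` and `max_value` are illustrative names, not anything from the standard library.

```cpp
#include <cassert>

// Expected number of draws for the crude loop: each draw is accepted with
// probability n / (max_value + 1), so the mean of the geometric
// distribution is (max_value + 1) / n.
double expected_calls_crude(long long max_value, long long n) {
    assert(n > 0 && n <= max_value + 1);
    return static_cast<double>(max_value + 1) / static_cast<double>(n);
}
```

Assuming RAND_MAX is 32767, asking for a value below 8 this way would take 4096 calls on average, which is why the crude loop is only reasonable when n is close to RAND_MAX.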

A more efficient approach is to take a large range whose length is divisible by n, such as the RAND_MAX - RAND_MAX % n values from 0 up to (but not including) RAND_MAX - RAND_MAX % n, keep generating random numbers until you get one that lies in that range, and then take the modulus:
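int x;

// Accept only values below the largest multiple of n that is <= RAND_MAX,
// so every residue 0..n-1 is reachable from the same number of values.
do {
    x = rand();
} while (x >= (RAND_MAX - RAND_MAX % n));

x %= n;

For small values of n, this will rarely require more than one call to rand(), since at most n of the RAND_MAX + 1 possible values are rejected.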
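
The rejection loop can be wrapped in a reusable function. The version below is a sketch under stated assumptions rather than the answer's own code: `bounded`, `bounded_rand`, `next` and `max_value` are hypothetical names, and the generator is passed in as a callable so the logic can be exercised with a deterministic stand-in instead of rand(). Like the loop above, it requires n to be no larger than the generator's maximum.

```cpp
#include <cassert>
#include <cstdlib>

// Draw from [0, n) without modulo bias. `next` must return integers
// uniformly distributed over [0, max_value]; both names are placeholders.
template <typename Gen>
int bounded(Gen&& next, int max_value, int n) {
    assert(n > 0 && n <= max_value);
    // Largest multiple of n that does not exceed max_value.
    const int limit = max_value - max_value % n;
    int x;
    do {
        x = next();  // redraw while x falls in the rejected tail
    } while (x >= limit);
    return x % n;
}

// Convenience wrapper around rand(), mirroring the loop above.
int bounded_rand(int n) {
    return bounded([] { return std::rand(); }, RAND_MAX, n);
}
```

With a stand-in generator that cycles through 0..10, `bounded(fake, 10, 3)` accepts 0..8 and rejects 9 and 10, so each of the three results is produced by exactly three source values. A call such as `bounded_rand(6)` then gives an unbiased die roll in 0..5, provided rand() itself is uniform.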



Works cited and further reading:


CPlusPlus Reference
Eternally Confuzzled 



