Does "n * (rand() / RAND_MAX)" make a skewed random number distribution?

Question

I'd like to find an unskewed way of getting random numbers in C (although at most I'm going to be using it for values of 0-20, and more likely only 0-8). I've seen this formula but after running some tests I'm not sure if it's skewed or not. Any help?

Here is the full function used:

int randNum() 
{ 
    return 1 + (int) (10.0 * (rand() / (RAND_MAX + 1.0)));
}

I seeded it using:

unsigned int iseed = (unsigned int)time(NULL);
srand (iseed);

The one suggested below refuses to work for me. I tried:

int greek;
for (j=0; j<50000; j++)
{
    greek = rand_lim(5);
    printf("%d, ", greek);
    greek = (int) (NUM * (rand() / (RAND_MAX + 1.0)));
    int togo = number[greek];
    number[greek] = togo+1;
}

and it stops working and gives me the same number 50000 times when I comment out printf.

Answer

Yes, it's skewed, unless the number of values rand() can return (RAND_MAX + 1, since the range 0 to RAND_MAX is inclusive) happens to be a multiple of 10.

If you take the numbers from 0 to RAND_MAX, and try to divide them into 10 piles, you really have only three possibilities:

  1. RAND_MAX + 1 is a multiple of 10, and the piles come out even.
  2. RAND_MAX + 1 is not a multiple of 10, and the piles come out uneven.
  3. You split it into uneven groups to start with, but throw away all the "extras" that would make it uneven.

You rarely have control over RAND_MAX, and RAND_MAX + 1 is usually a power of two anyway (RAND_MAX is commonly 32767 or 2147483647), so it is never a multiple of 10. That really only leaves 2 and 3 as possibilities.
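The uneven piles can be counted directly. The following is a minimal sketch, not part of the original answer: `count_piles` is a hypothetical helper, and `max_value` is an assumed stand-in for RAND_MAX so that every possible output can be enumerated quickly. It applies the scaling formula from the question to each value in turn.

```c
#include <assert.h>

/* Count how many of the outputs 0..max_value land in each of n piles
   when scaled with the formula from the question.
   max_value stands in for RAND_MAX (an assumption for illustration);
   piles must have room for n counts. */
void count_piles(int max_value, int n, long *piles) {
    assert(max_value > 0 && n > 0);
    for (int k = 0; k < n; k++)
        piles[k] = 0;
    /* long long so the loop also terminates when max_value == INT_MAX */
    for (long long x = 0; x <= max_value; x++) {
        int pile = (int)(n * (x / (max_value + 1.0)));
        piles[pile]++;
    }
}
```

With `max_value` set to 32767 and `n` set to 10, eight piles receive 3277 values and two receive 3276, so some results are very slightly more likely than others.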

The third option looks roughly like this. [Edit: After some thought, I've revised this to produce numbers in the range 0...(limit-1), to fit with the way most things in C and C++ work. This also simplifies the code (a tiny bit).]

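#include <stdlib.h>

int rand_lim(int limit) {
/* return a random number in the range [0..limit)
   (assumes 0 < limit <= RAND_MAX)
 */

    int divisor = RAND_MAX/limit;
    int retval;

    /* retry whenever rand() lands in the leftover values past the
       last full pile; >= also covers large limits, where the
       quotient can exceed limit */
    do { 
        retval = rand() / divisor;
    } while (retval >= limit);

    return retval;
}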
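As a usage sketch, the same rejection idea can be wrapped to give the 1..10 range that the question's randNum() produces. The names `bounded_rand` and `rand_1_to_10` are hypothetical, introduced only for this example. The helper repeats the rejection loop so the block stands on its own, and it tests with `>=` so that any quotient at or past the limit is discarded.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: uniform value in [0, limit) by rejection. */
int bounded_rand(int limit) {
    assert(limit > 0 && limit <= RAND_MAX);
    int divisor = RAND_MAX / limit;
    int value;
    do {
        value = rand() / divisor;
    } while (value >= limit);  /* discard the leftover values */
    return value;
}

/* Possible replacement for the question's randNum(): value in 1..10. */
int rand_1_to_10(void) {
    return 1 + bounded_rand(10);
}
```

Seed once with srand() at program start, as in the question, and then call `rand_1_to_10()` wherever a value is needed.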

For anybody who questions whether this method might leave some skew, I also wrote a rather different version, purely for testing. This one uses a decidedly non-random generator with a very limited range, so we can simply iterate through every number in the range. It looks like this:

#include <stdlib.h>
#include <stdio.h>

#define MAX 1009

int next_val() {
    // just return consecutive numbers
    static int v=0;

    return v++;
}

int lim(int limit) {
    int divisor = MAX/limit;
    int retval;

    do {
        retval = next_val() / divisor;
    } while (retval == limit);

    return retval;
}

#define LIMIT 10

int main() {

    // we'll allocate extra space at the end of the array:
    int buckets[LIMIT+2] = {0};
    int i;

    for (i=0; i<MAX; i++)
        ++buckets[lim(LIMIT)];

    // and print one beyond what *should* be generated
    for (i=0; i<LIMIT+1; i++)
        printf("%2d: %d\n", i, buckets[i]);
}

So, we're starting with the numbers from 0 to 1008 (1009 is prime, so it won't be an exact multiple of any range we choose). That gives us 1009 numbers to split into 10 buckets. That should give 100 in each bucket, and the 9 leftovers (so to speak) get "eaten" by the do/while loop. As it's written right now, it allocates and prints out an extra bucket. When I run it, I get exactly 100 in each of buckets 0..9, and 0 in bucket 10. If I comment out the do/while loop, I see 100 in each of 0..9, and 9 in bucket 10.

Just to be sure, I've re-run the test with various other numbers for both the range produced (mostly prime numbers) and the number of buckets. So far, I haven't been able to get it to produce skewed results for any range (as long as the do/while loop is enabled, of course).
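Because next_val() in the test above keeps counting past MAX, a variation that visits each possible value exactly once may be easier to reason about. The sketch below is an assumed alternative, not the author's test. `bucket_or_reject` and `tally_all` are hypothetical names, and `num_values` plays the role of MAX.

```c
#include <assert.h>

/* Map one generator output x to a bucket in [0, limit), or return -1
   if the rejection loop would discard it. num_values is the number of
   distinct outputs (the role MAX plays in the test above). */
int bucket_or_reject(int x, int num_values, int limit) {
    assert(limit > 0 && limit <= num_values);
    int divisor = num_values / limit;
    int bucket = x / divisor;
    return bucket < limit ? bucket : -1;
}

/* Visit every output 0..num_values-1 exactly once, fill
   buckets[0..limit-1] with the accepted counts, and return how many
   outputs were rejected. */
int tally_all(int num_values, int limit, int *buckets) {
    int rejected = 0;
    for (int b = 0; b < limit; b++)
        buckets[b] = 0;
    for (int x = 0; x < num_values; x++) {
        int b = bucket_or_reject(x, num_values, limit);
        if (b < 0)
            rejected++;
        else
            buckets[b]++;
    }
    return rejected;
}
```

For 1009 values and 10 buckets this gives 100 accepted values per bucket and 9 rejected, which matches the counts described above.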

One other detail: there is a reason I used division instead of remainder in this algorithm. With a good (or even decent) implementation of rand() it's irrelevant, but when you clamp numbers to a range using division, you keep the upper bits of the input. When you do it with remainder, you keep the lower bits of the input. As it happens, with a typical linear congruential pseudo-random number generator, the lower bits tend to be less random than the upper bits. A reasonable implementation will throw out a number of the least significant bits already, rendering this irrelevant. On the other hand, there are some pretty poor implementations of rand around, and with most of them, you end up with better quality of output by using division rather than remainder.

I should also point out that there are generators that do roughly the opposite -- the lower bits are more random than the upper bits. At least in my experience, these are quite uncommon. Generators in which the upper bits are more random are considerably more common.
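The division-versus-remainder point can be illustrated with a deliberately poor generator. The sketch below is an assumption for illustration and not any particular library's rand(): `poor_rand` is a hypothetical linear congruential generator with a power-of-two modulus that returns its whole 31-bit state, so its lowest bit simply alternates.

```c
#include <assert.h>
#include <stdint.h>

/* Deliberately poor LCG (hypothetical, for illustration only):
   returns the full 31-bit state without discarding any low bits. */
static uint32_t lcg_state = 1u;

uint32_t poor_rand(void) {
    lcg_state = lcg_state * 1103515245u + 12345u;  /* wraps modulo 2^32 */
    return lcg_state & 0x7fffffffu;
}

/* Remainder keeps the low bit of the generator output. */
int coin_by_remainder(void) {
    return (int)(poor_rand() % 2u);
}

/* Division keeps the high bit of the 31-bit output. */
int coin_by_division(void) {
    return (int)(poor_rand() / 0x40000000u);
}

/* Count how often a coin repeats its previous result. */
int count_repeats(int (*coin)(void), int flips) {
    assert(flips > 0);
    int repeats = 0;
    int prev = coin();
    for (int i = 1; i < flips; i++) {
        int cur = coin();
        if (cur == prev)
            repeats++;
        prev = cur;
    }
    return repeats;
}
```

With this generator, `count_repeats(coin_by_remainder, n)` is always 0, because the multiplier and increment are both odd and the low bit therefore flips on every step. The division version draws on the top bit, whose period is far longer.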
