Use rand() to generate uniformly distributed floating point numbers on (a,b), [a,b), (a,b], and [a,b]

Problem description

I want to collect the "best" way to generate random numbers on all four types of intervals in one place. I'm sick of Googling this. Search results turn up a lot of crap. Even the relevant results are pages or blogs that are often flat-out wrong or have discussions where self-appointed experts disagree with each other over some technicality, often with their "answers" seemingly exposing that they do not know about the different types (closed, open, semi-open) of intervals. I'm sick of reading bad information about generating random numbers in C for such a "simple" question.

Please show me how to generate uniformly distributed floating point numbers. Here is my typical way (using "long double" as an example) on (a,b), [a,b), (a,b], and [a,b]:

long double a=VALUE1,b=VALUE2;
long double x1,x2,x3,x4;

srand((unsigned)time(NULL));

/* x1 will be an element of [a,b] */
x1=((long double)rand()/RAND_MAX)*(b-a) + a;

/* x2 will be an element of [a,b) */
x2=((long double)rand()/((long double)RAND_MAX+1))*(b-a) + a;

/* x3 will be an element of (a,b] */
x3=(((long double)rand()+1)/((long double)RAND_MAX+1))*(b-a) + a;

/* x4 will be an element of (a,b) */    
x4=(((long double)rand()+1)/((long double)RAND_MAX+2))*(b-a) + a;

For the special case of the unit intervals (0,1), [0,1), (0,1], and [0,1]:

long double x1,x2,x3,x4;

srand((unsigned)time(NULL));

/* x1 will be an element of [0,1] */
x1=((long double)rand()/RAND_MAX);

/* x2 will be an element of [0,1) */
x2=((long double)rand()/((long double)RAND_MAX+1));

/* x3 will be an element of (0,1] */
x3=(((long double)rand()+1)/((long double)RAND_MAX+1));

/* x4 will be an element of (0,1) */    
x4=(((long double)rand()+1)/((long double)RAND_MAX+2));

I believe the casts on both RAND_MAX and the return value of rand() are necessary, not only because we want to avoid integer division but because they are ints and otherwise adding one (or two) might overflow them.

I think that versions for "double" and "float" are exactly the same but just replacing the type. Are there any subtleties that arise for the different floating point types?

Do you see any problems with the above implementations? If so, what and how would you fix it?

EDIT: The above implementations pass necessary tests for them to be correct (at least on a 64-bit Intel Core 2 Duo machine running 64-bit Linux): x1 can generate both 0 and 1, x2 can generate 0 but hasn't been seen to generate 1, x3 can generate 1 but hasn't been seen to generate 0, and x4 hasn't been seen to generate either 0 or 1.

Solution

If you want every double in the range to be possible, with probability proportional to the difference between it and its adjacent double values, then it's actually really hard.

Consider the range [0, 1000]. There are an absolute bucketload of values in the very tiny first part of the range: a million of them between 0 and 1000000*DBL_MIN, and DBL_MIN is about 2 * 10^-308. There are more than 2^32 values in the range altogether, so clearly one call to rand() isn't enough to generate them all. What you'd need to do is generate the mantissa of your double uniformly, and select an exponent with an exponential distribution, and then fudge things a bit to ensure the result is in range.
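As a rough illustration of the mantissa-plus-exponent idea, here is a minimal sketch for the unit interval only. It assumes IEEE 754 binary64 doubles (52 fraction bits), and the names random_bit, random_bits and dense_unit_double are made up for this example. It stops at the smallest normal double, so zero and subnormals are never produced, and it makes no attempt at the "fudging" needed for an arbitrary range such as [0, 1000].

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>
#include <stdlib.h>

/* One pseudo-random bit. Taking bit 7 rather than bit 0 is a precaution:
   some rand() implementations have weak low-order bits. */
static int random_bit(void)
{
    return (rand() >> 7) & 1;
}

/* Collects n pseudo-random bits (0 <= n <= 64) into an integer. */
static uint64_t random_bits(int n)
{
    assert(n >= 0 && n <= 64);
    uint64_t value = 0;
    for (int i = 0; i < n; i++)
        value = (value << 1) | (uint64_t)random_bit();
    return value;
}

/* Hypothetical helper: returns a double in [DBL_MIN, 1).
   Assumes IEEE 754 binary64, i.e. 52 explicit fraction bits. */
double dense_unit_double(void)
{
    /* Choose the binade [scale, 2*scale): scale is 1/2 with probability 1/2,
       1/4 with probability 1/4, and so on, down to the smallest normal. */
    double scale = 0.5;
    while (scale > DBL_MIN && random_bit() == 0)
        scale *= 0.5;

    /* 52 uniform fraction bits give a significand in [1, 2). */
    double significand = 1.0 + (double)random_bits(52) / 4503599627370496.0; /* 2^52 */

    /* Scaling a normal value by a power of two is exact here. */
    return significand * scale;
}
```

Each halving of the interval is chosen half as often as the previous one, which is what makes the density uniform even though small values have far more representable neighbours. Call srand() once before using it, as in the question.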

If you don't require every double in the range to be possible, then the difference between open and closed ranges is fairly irrelevant, because in a "true" continuous uniform random distribution, the probability of any exact value occurring is 0 anyway. So you might as well just generate a number in the open range.

All that said: yes, your proposed implementations generate values that are in the ranges you say, and for the closed and half-closed ranges they generate the end-points with probability 1/(RAND_MAX+1) or so. That's good enough for many or most practical purposes.
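The endpoint behaviour can be checked without any randomness by feeding the two extreme return values of rand(), 0 and RAND_MAX, through the four formulas. The sketch below uses double instead of long double, and the names interval_kind, unit_fraction and scale_to are invented for this example; it also assumes RAND_MAX+2 is exactly representable in double.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical labels for the four interval types in the question. */
typedef enum { CLOSED, HALF_OPEN_RIGHT, HALF_OPEN_LEFT, OPEN } interval_kind;

/* Maps r (a value rand() could return, 0..RAND_MAX) into the unit interval
   using the same four formulas as the question. */
double unit_fraction(int r, interval_kind kind)
{
    assert(r >= 0 && r <= RAND_MAX);
    double x = (double)r;        /* convert first so that +1 cannot overflow int */
    double m = (double)RAND_MAX;
    switch (kind) {
    case CLOSED:          return x / m;                  /* [0,1] */
    case HALF_OPEN_RIGHT: return x / (m + 1.0);          /* [0,1) */
    case HALF_OPEN_LEFT:  return (x + 1.0) / (m + 1.0);  /* (0,1] */
    default:              return (x + 1.0) / (m + 2.0);  /* (0,1) */
    }
}

/* Stretches a unit-interval value u onto the interval from a to b. */
double scale_to(double a, double b, double u)
{
    return u * (b - a) + a;
}
```

For example, unit_fraction(rand(), HALF_OPEN_RIGHT) gives a value in [0,1), and scale_to(a, b, ...) stretches it. One caveat worth keeping in mind: the multiply-add in scale_to rounds, so for some a and b (for instance a = 1e16, b = 1e16 + 2) a value that was strictly below 1 can still land exactly on b.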

Your fiddling around with +1 and +2 works provided that RAND_MAX+2 is within the range that double can represent exactly. This is true for IEEE double precision and 32 bit int, but it's not actually guaranteed by the C standard.
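The representability condition can be tested directly, and it also shows where the "same code, different type" idea breaks for float. The following is a sketch assuming a radix-2 floating point format and a 32-bit int; fits_exactly, rand_max_safe_for_double, half_open_float and half_open_double are hypothetical names. The largest RAND_MAX is passed in explicitly so the effect is visible even on platforms where RAND_MAX is only 32767.

```c
#include <assert.h>
#include <float.h>
#include <limits.h>
#include <stdlib.h>

/* Returns 1 when every integer from 0 to max_value + 2 is exactly
   representable in a binary format with mant_digits significand bits. */
int fits_exactly(long long max_value, int mant_digits)
{
    assert(max_value >= 0 && max_value <= LLONG_MAX - 2);
    assert(mant_digits > 0 && mant_digits < 63);
    return max_value + 2 <= (1LL << mant_digits);
}

/* Checks the condition for this platform's RAND_MAX and double. */
int rand_max_safe_for_double(void)
{
    return fits_exactly(RAND_MAX, DBL_MANT_DIG);
}

/* The [0,1) formula from the question, written for float and for double,
   with the generator's maximum passed in as r_max. */
float half_open_float(int r, int r_max)
{
    return (float)r / ((float)r_max + 1.0f);
}

double half_open_double(int r, int r_max)
{
    return (double)r / ((double)r_max + 1.0);
}
```

With r_max = 2147483647 (a common RAND_MAX on Linux), float has only 24 significand bits, so (float)r rounds up to 2147483648 for the largest inputs and half_open_float returns exactly 1.0f, which is outside [0,1). The double version stays strictly below 1.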

(I'm ignoring your use of long double because it confuses things a bit. It's guaranteed to be at least as big as double, but there are common implementations in which it's exactly the same as double, so the long doesn't add anything except uncertainty.)
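How much long double actually buys on a given platform can be read from <float.h>. A small sketch follows; long_double_extra_bits is a made-up name, and static_assert requires a C11 compiler.

```c
#include <assert.h>
#include <float.h>

/* long double must be able to hold every double value, so with the same
   radix its significand cannot be narrower. Checked at compile time (C11). */
static_assert(LDBL_MANT_DIG >= DBL_MANT_DIG,
              "long double has fewer significand digits than double");

/* Number of extra significand digits long double offers over double;
   0 means the two types have the same precision on this platform. */
int long_double_extra_bits(void)
{
    return LDBL_MANT_DIG - DBL_MANT_DIG;
}
```

A result of 0 means the long double versions of the formulas behave exactly like the double ones; on x86-64 Linux with the x87 extended format the result is typically 11.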
