Uniformly distributed pseudorandom integers inside CUDA kernel
Question
How can I generate uniformly distributed pseudorandom integers within a kernel? As far as I know, the cuRAND API supports a discrete Poisson distribution, but not a discrete uniform one.
Solution
I suggest two options within a kernel:
1) using curand_uniform to obtain a random floating point number from a uniform distribution, then map it to integer interval:
float randu_f = curand_uniform(&localState);
randu_f *= (B-A+0.999999); // You should not use (B-A+1)*
randu_f += A;
int randu_int = __float2int_rz(randu_f);
__float2int_rz converts the single-precision floating-point value x to a signed integer in round-towards-zero mode.
*curand_uniform returns a pseudorandom float uniformly distributed between 0.0 and 1.0, where 0.0 is excluded and 1.0 is included.
The multiplier should be B-A plus biggest_float_before_1 (the largest float below 1) or something slightly smaller, rather than B-A+1, because there is a small chance that curand_uniform returns exactly 1.0, which would then map to B+1 and fall out of range. I have not checked whether biggest_float_before_1, combined with floating-point arithmetic on the GPU, is guaranteed to stay within the defined bounds.
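To make the edge case at 1.0 concrete, here is a minimal sketch of a variant of option 1 that clamps the result instead of relying on a constant such as 0.999999. It is not the exact code above: the helper name uniform_to_int, the HD macro, and the kernel fill_random_ints are hypothetical names introduced for illustration, and the sketch assumes A <= B and B-A+1 <= 2^24 so that the count is exactly representable as a float. It also truncates before adding A, so a negative A does not interact with round-towards-zero.

```cpp
#include <cassert>

#ifdef __CUDACC__
#include <curand_kernel.h>
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical helper: map u in (0, 1] to an integer in [a, b].
// Assumes a <= b and (b - a + 1) <= 2^24.
HD inline int uniform_to_int(float u, int a, int b)
{
    float n = static_cast<float>(b - a + 1);   // number of possible values
    int k = static_cast<int>(u * n);           // truncates toward zero: 0..n
    if (k > b - a) k = b - a;                  // k == n happens when u == 1.0
    return a + k;
}

#ifdef __CUDACC__
// Sketch of a kernel: one generator state and one output integer per thread.
__global__ void fill_random_ints(int *out, int count, int a, int b,
                                 unsigned long long seed)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= count) return;
    curandState localState;
    curand_init(seed, tid, 0, &localState);    // seed, subsequence, offset
    out[tid] = uniform_to_int(curand_uniform(&localState), a, b);
}
#endif
```

The clamp folds the rare u == 1.0 case into B, which gives B a very slightly higher probability than the other values; for most uses that difference should be negligible.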
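2) Call curand, which returns the next pseudorandom 32-bit unsigned integer, and reduce it with a modulo:
int randu_int = A + curand(&localState) % (B-A+1); // B-A+1 possible values, so that B is included as in option 1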
However, integer modulo is comparatively expensive on the GPU, so method 1 should be faster.
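If the modulo is the concern, one alternative that stays in integer arithmetic is a multiply-shift reduction: take the high 32 bits of the 64-bit product of the random word and the number of values. This is a swapped-in technique, not part of the two options above. The sketch below shows both reductions side by side; the helper names bits_to_int_mod and bits_to_int_mulhi and the HD macro are hypothetical, and it assumes A <= B with B-A small enough not to overflow int.

```cpp
#include <cassert>
#include <cstdint>

#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Option 2 with B included: r % n, where n = b - a + 1.
HD inline int bits_to_int_mod(uint32_t r, int a, int b)
{
    uint32_t n = static_cast<uint32_t>(b - a) + 1u;
    return a + static_cast<int>(r % n);
}

// Modulo-free variant: the high 32 bits of r * n always lie in [0, n).
HD inline int bits_to_int_mulhi(uint32_t r, int a, int b)
{
    uint32_t n = static_cast<uint32_t>(b - a) + 1u;
    uint64_t wide = static_cast<uint64_t>(r) * n;
    return a + static_cast<int>(wide >> 32);
}
```

Inside a kernel, r would come from curand(&localState). Both reductions are slightly biased whenever B-A+1 does not divide 2^32 evenly; for small ranges the bias is tiny. On the device, the high half of the product can also be obtained with the __umulhi intrinsic, though whether that is measurably faster than the modulo is something to benchmark rather than assume.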