C++ uniform_int_distribution always returning min() on first invocation

Question

In at least one implementation of the standard library, the first invocation of a std::uniform_int_distribution<> does not return a random value, but rather the distribution's min value. That is, given the code:

default_random_engine engine( any_seed() );
uniform_int_distribution< int > distribution( smaller, larger );
auto x = distribution( engine );
assert( x == smaller );

...x will in fact be smaller for any values of any_seed(), smaller, or larger.

To play along at home, you can try a code sample that demonstrates this problem in gcc 4.8.1.

I trust this is not correct behavior? If it is correct behavior, why would a random distribution return this clearly non-random value?

Recommended answer

Explanation for the observed behavior

This is how uniform_int_distribution maps the random bits to numbers if the range of possible outcomes is smaller than the range of numbers the rng produces:

const __uctype __uerange = __urange + 1; // __urange can be zero
const __uctype __scaling = __urngrange / __uerange;
const __uctype __past = __uerange * __scaling;
do
  __ret = __uctype(__urng()) - __urngmin;
while (__ret >= __past);
__ret /= __scaling;

where __urange is larger - smaller and __urngrange is the difference between the maximum and the minimum value the rng can return. (Code from bits/uniform_int_dist.h in libstdc++ 6.1)
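To make the arithmetic concrete, here is a minimal sketch of that downscaling step, written against minstd_rand0's documented range. The names scaling_for and map_raw_value are hypothetical helpers of mine, not part of libstdc++, and the sketch only models the case quoted above, where the engine's range is wider than the requested one.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Number of distinct outcomes in [lo, hi] (__uerange in the quoted code).
std::uint64_t outcome_count(int lo, int hi) {
    assert(lo <= hi);
    return static_cast<std::uint64_t>(static_cast<long long>(hi) - lo) + 1;
}

// Scaling factor the quoted loop would compute for minstd_rand0 and [lo, hi].
std::uint64_t scaling_for(int lo, int hi) {
    const std::uint64_t urngrange =
        std::minstd_rand0::max() - std::minstd_rand0::min();
    return urngrange / outcome_count(lo, hi);
}

// Hypothetical helper: maps one raw engine output to [lo, hi] as the loop does.
// Returns false where the real loop would reject the value and draw again.
bool map_raw_value(std::uint64_t raw, int lo, int hi, int& out) {
    assert(raw >= std::minstd_rand0::min() && raw <= std::minstd_rand0::max());
    const std::uint64_t scaling = scaling_for(lo, hi);
    assert(scaling != 0);  // only the "engine range is wider" case is modelled
    const std::uint64_t past = outcome_count(lo, hi) * scaling;
    const std::uint64_t ret = raw - std::minstd_rand0::min();
    if (ret >= past) {
        return false;  // rejected: the real code would call the engine again
    }
    out = static_cast<int>(static_cast<long long>(ret / scaling) + lo);
    return true;
}
```

The point to take away is that the mapping is a plain integer division: every raw value below the scaling factor lands in the first bucket, so a small first output from the engine always becomes the distribution's minimum.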

In our case, the rng default_random_engine is a minstd_rand0, which yields __scaling == 195225785 for the range [0,10] you tested with. Thus, if rng() < 195225785, the distribution will return 0.

The first number a minstd_rand0 returns is

(16807 * seed) % 2147483647

(where seed == 0 gets adjusted to 1 btw). We can thus see that the first value produced by a minstd_rand0 seeded with a number smaller than 11615 will yield 0 with the uniform_int_distribution< int > distribution( 0, 10 ); you used. (mod off-by-one-errors on my part. ;) )
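If you want to check the seed threshold yourself, a rough sketch like the following counts how many small seeds put the first draw into bucket 0 for the range [0, 10]. It applies the division from the libstdc++ code by hand instead of calling uniform_int_distribution, so the result does not depend on which standard library you compile with. The names kScaling, first_bucket and count_seeds_yielding_min are my own, and the rejection step is ignored because small seeds never reach it.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Scaling factor from the quoted loop for minstd_rand0 and the range [0, 10].
constexpr std::uint64_t kScaling =
    (std::minstd_rand0::max() - std::minstd_rand0::min()) / 11;
static_assert(kScaling == 195225785, "matches the value quoted above");

// Bucket that the first output of a freshly seeded minstd_rand0 falls into.
// Ignores the rejection step, which only matters for very large raw values.
std::uint64_t first_bucket(std::uint32_t seed) {
    std::minstd_rand0 engine(seed);
    const std::uint64_t ret = engine() - std::minstd_rand0::min();
    return ret / kScaling;
}

// Counts the seeds in [0, limit] whose first draw lands in bucket 0.
std::uint32_t count_seeds_yielding_min(std::uint32_t limit) {
    // Seeds are reduced modulo the engine's modulus, so stay below it.
    assert(limit < std::minstd_rand0::modulus);
    std::uint32_t count = 0;
    for (std::uint32_t seed = 0; seed <= limit; ++seed) {
        if (first_bucket(seed) == 0) {
            ++count;
        }
    }
    return count;
}
```

By this sketch, every seed from 0 up to and including 11615 lands in bucket 0, and 11616 is the first seed that does not, which agrees with the estimate above up to the off-by-one that was already flagged.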

You mentioned the problem going away for bigger seeds: As soon as the seeds get big enough to actually make the mod operation do something, we cannot simply assign a whole range of values to the same output by division, so the results will look better.

No. You introduced significant bias in what is supposed to be a random 32-bit seed by always choosing it small. That bias showing up in the results is not surprising or evil. For random seeds, even your minstd_rand0 will yield a fairly uniformly random first value. (Though the sequence of numbers after that will not be of great statistical quality.)

Case 1: You want random numbers of high statistical quality.

For that, you use a better rng like mt19937 and seed its entire state space. For the Mersenne Twister, that's 624 32-bit integers. (For reference, here is my attempt to do this properly with some helpful suggestions in the answer.)
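One common way to do this, sketched here under the assumption that std::random_device is a usable entropy source on your platform, is to fill a buffer of state_size words and feed it through std::seed_seq. The function name make_fully_seeded_mt19937 is mine. Keep in mind that seed_seq mixes its input, so this is an approximation of seeding the full state directly, not an exact equivalent.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
#include <functional>
#include <random>

static_assert(std::mt19937::state_size == 624,
              "mt19937 keeps 624 words of 32-bit state");

// Hypothetical helper: seeds mt19937 with as many random words as it has state.
std::mt19937 make_fully_seeded_mt19937() {
    std::random_device source;  // assumed to deliver real entropy here
    std::array<std::uint32_t, std::mt19937::state_size> seed_data;
    std::generate(seed_data.begin(), seed_data.end(), std::ref(source));

    // seed_seq spreads the 624 words over the engine's whole state.
    std::seed_seq seq(seed_data.begin(), seed_data.end());
    std::mt19937 rng(seq);
    return rng;
}
```

You would then pass the returned engine to the distribution as usual, for example uniform_int_distribution< int > distribution( 0, 10 ); followed by distribution( rng ).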

Case 2: You really want to use those small seeds only.

We can still get decent results out of this. The problem is that pseudo random number generators commonly depend "somewhat continuously" on their seed. To work around this, we discard enough numbers to let the initially similar sequences of output diverge. So if your seed must be small, you can initialize your rng like this:

std::mt19937 rng(smallSeed);
rng.discard(700000);
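Wrapped into a function and combined with the distribution from the question, those two lines might be used like this. The helper names make_warmed_up_rng and first_draw are hypothetical, and 700000 is simply the discard count from the snippet above, not a value I have derived.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical helper: seeds mt19937 with a small value, then throws away
// the first 700000 outputs so that similar seeds have time to diverge.
std::mt19937 make_warmed_up_rng(std::uint32_t small_seed) {
    std::mt19937 rng(small_seed);
    rng.discard(700000);
    return rng;
}

// Draws one value in [smaller, larger] from a warmed-up engine.
int first_draw(std::uint32_t small_seed, int smaller, int larger) {
    assert(smaller <= larger);
    std::mt19937 rng = make_warmed_up_rng(small_seed);
    std::uniform_int_distribution<int> distribution(smaller, larger);
    return distribution(rng);
}
```

The engine stays fully deterministic for a given seed; the discard only moves the starting point further along the sequence.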

It is vital that you use a good rng like the Mersenne Twister for this. I do not know of any method to get even decent values out of a poorly seeded minstd_rand0, for example see this train-wreck. Even if seeded properly, the statistical properties of a mt19937 are superior by far.

Concerns about the large state space or slow generation you sometimes hear about are usually of no concern outside the embedded world. According to boost and cacert.at, the MT is even way faster than minstd_rand0.

You still need to do the discard trick though, even if your results look good to the naked eye without it. It takes less than a millisecond on my system, and you don't seed rngs very often, so there is no reason not to.

Note that I am not able to give you a sharp estimate for the number of discards we need. I took that value from this answer, which links this paper (http://www.iro.umontreal.ca/~lecuyer/myftp/papers/lfsr04.pdf) for a rationale. I don't have the time to work through that right now.
