Generating uniform random integers with a certain maximum

Question

I want to generate uniform integers that satisfy 0 <= result <= maxValue.

I already have a generator that returns uniform values in the full range of the built in unsigned integer types. Let's call the methods for this byte Byte(), ushort UInt16(), uint UInt32() and ulong UInt64(). Assume that the result of these methods is perfectly uniform.

The method signatures I'd like are uint UniformUInt(uint inclusiveMaxValue) and ulong UniformUInt(ulong inclusiveMaxValue).

What I'm looking for:

  1. Correctness
    I'd prefer the return values to be uniformly distributed in the given interval.
    But a very small bias is acceptable if it increases performance significantly. By that I mean a bias of an order that allows a distinguisher to succeed with probability 2/3 given 2^64 values.
    It must work correctly for any maxValue.
  2. Performance
    The method should be fast.
  3. Efficiency
    The method should consume little raw randomness, since depending on the underlying generator, generating the raw bytes might be costly. Wasting a few bits is fine, but consuming, say, 128 bits to generate a single number is probably excessive.

It's also possible to cache some left over randomness from the previous call in some member variables.
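One way such a cache could look is sketched below. It is only an illustration, not the asker's code: the class name `BitCache`, the method `NextBits`, and the `Func<ulong>` source standing in for `UInt64()` are all assumptions. When the cache holds fewer bits than requested, the leftovers are simply discarded and a fresh 64-bit word is drawn, which wastes a few bits but keeps the logic simple.

```csharp
using System;

// Hypothetical helper: hands out small groups of bits from a cached 64-bit word.
sealed class BitCache
{
    private readonly Func<ulong> _uint64; // assumed to return perfectly uniform values
    private ulong _buffer;                // unused bits live in the low end
    private int _bitsLeft;                // how many bits of _buffer are still unused

    public BitCache(Func<ulong> uint64Source)
    {
        _uint64 = uint64Source;
    }

    // Returns `count` uniform bits (1..64) in the low end of the result.
    public ulong NextBits(int count)
    {
        if (count < 1 || count > 64)
            throw new ArgumentOutOfRangeException(nameof(count));

        if (_bitsLeft < count)
        {
            _buffer = _uint64(); // any leftover bits are thrown away here
            _bitsLeft = 64;
        }

        if (count == 64)
        {
            // Shifting a ulong by 64 is a no-op in C#, so handle this case separately.
            _bitsLeft = 0;
            return _buffer;
        }

        ulong result = _buffer & ((1UL << count) - 1);
        _buffer >>= count;
        _bitsLeft -= count;
        return result;
    }
}
```

A rejection loop could then request only as many bits as the bound needs instead of a full word per attempt; whether that pays off depends on how expensive the underlying generator is.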

Be careful with int overflows, and wrapping behavior.
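To make the pitfall concrete, here is a small sketch (the helper name `ExclusiveBound` is made up for this example): converting the inclusive maximum to an exclusive bound wraps to 0 when the maximum is uint.MaxValue, so a naive `UInt32() % (maxValue + 1)` would divide by zero for that input.

```csharp
using System;

static class WrapDemo
{
    // Hypothetical helper: returns the exclusive bound for an inclusive maximum,
    // or null when that bound (2^32) does not fit in a uint.
    public static uint? ExclusiveBound(uint inclusiveMaxValue)
    {
        unchecked
        {
            uint exclusive = inclusiveMaxValue + 1; // wraps to 0 for uint.MaxValue
            return exclusive == 0 ? (uint?)null : exclusive;
        }
    }
}
```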

I already have a solution (I'll post it as an answer), but it's a bit ugly for my tastes. So I'd like to get ideas for better solutions.

Suggestions on how to unit test with large maxValues would be nice too, since I can't generate a histogram with 2^64 buckets and 2^74 random values. Another complication is that with certain bugs, the distribution is heavily biased for only some maxValues and only very slightly biased for others.
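One possible approach, offered only as a sketch, is to test the same logic at a width small enough to enumerate: write the accept/reject step for 8-bit input, feed it every possible byte, and check that each result occurs exactly equally often for every maxValue. The names `MapOrReject` and `IsExactlyUniform` are invented for this example, and the check only covers the 32-bit and 64-bit code to the extent that they share the same logic.

```csharp
using System;

static class NarrowWidthCheck
{
    // Maps one raw 8-bit sample to a result in [0, inclusiveMaxValue],
    // or returns -1 when the sample must be rejected.
    public static int MapOrReject(byte bits, byte inclusiveMaxValue)
    {
        int n = inclusiveMaxValue + 1;   // 1..256, cannot overflow in int
        int limit = 256 - (256 % n);     // largest multiple of n that is <= 256
        return bits < limit ? bits % n : -1;
    }

    // Feeds every possible byte through MapOrReject and checks that each
    // result value is produced the same non-zero number of times.
    public static bool IsExactlyUniform(byte inclusiveMaxValue)
    {
        var counts = new int[inclusiveMaxValue + 1];
        for (int bits = 0; bits < 256; bits++)
        {
            int r = MapOrReject((byte)bits, inclusiveMaxValue);
            if (r >= 0) counts[r]++;
        }
        foreach (int c in counts)
        {
            if (c == 0 || c != counts[0]) return false;
        }
        return true;
    }
}
```

Running `IsExactlyUniform` for all 256 possible maxValues takes only 65,536 evaluations, and it catches bugs that affect only particular bounds because every bound is checked exhaustively.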

Recommended answer

How about something like this as a general-purpose solution? The algorithm is based on that used by Java's nextInt method, rejecting any values that would cause a non-uniform distribution. So long as the output of your UInt32 method is perfectly uniform then this should be too.

uint UniformUInt(uint inclusiveMaxValue)
{
    unchecked
    {
        uint exclusiveMaxValue = inclusiveMaxValue + 1;

        // if exclusiveMaxValue is a power of two then we can just use a mask
        // also handles the edge case where inclusiveMaxValue is uint.MaxValue
        if ((exclusiveMaxValue & (~exclusiveMaxValue + 1)) == exclusiveMaxValue)
            return UInt32() & inclusiveMaxValue;

        uint bits, val;
        do
        {
            bits = UInt32();
            val = bits % exclusiveMaxValue;

            // if (bits - val + inclusiveMaxValue) overflows then val has been
            // taken from an incomplete chunk at the end of the range of bits
            // in that case we reject it and loop again
        } while (bits - val + inclusiveMaxValue < inclusiveMaxValue);

        return val;
    }
}
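The question also asks for a ulong overload. Assuming the same approach carries over unchanged to 64 bits, it might look like the sketch below. The wrapper class `UniformSampler` and the `Func<ulong>` source standing in for `UInt64()` are assumptions added to keep the example self-contained.

```csharp
using System;

sealed class UniformSampler
{
    private readonly Func<ulong> _uint64; // assumed to return perfectly uniform values

    public UniformSampler(Func<ulong> uint64Source)
    {
        _uint64 = uint64Source;
    }

    public ulong UniformUInt(ulong inclusiveMaxValue)
    {
        unchecked
        {
            ulong exclusiveMaxValue = inclusiveMaxValue + 1;

            // Power of two, or wrapped to 0 for ulong.MaxValue: a mask is enough.
            if ((exclusiveMaxValue & (~exclusiveMaxValue + 1)) == exclusiveMaxValue)
                return _uint64() & inclusiveMaxValue;

            ulong bits, val;
            do
            {
                bits = _uint64();
                val = bits % exclusiveMaxValue;

                // bits - val is the start of the chunk that bits fell into; if adding
                // inclusiveMaxValue wraps, the chunk is incomplete and bits is rejected.
            } while (bits - val + inclusiveMaxValue < inclusiveMaxValue);

            return val;
        }
    }
}
```

For example, with inclusiveMaxValue = 9 the top six raw values (2^64 - 6 through 2^64 - 1) fall into an incomplete chunk and are rejected, so each of the ten results remains equally likely.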

The rejection process could, theoretically, keep looping forever; in practice the performance should be pretty good. It's difficult to suggest any generally applicable optimisations without knowing (a) the expected usage patterns, and (b) the performance characteristics of your underlying RNG.
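To quantify "pretty good" for the 32-bit method: write n = inclusiveMaxValue + 1 and suppose n is not a power of two. The rejected raw values are exactly the 2^32 mod n values in the incomplete final chunk. If n <= 2^31 that remainder is less than n, and if n > 2^31 it equals 2^32 - n; either way it is below 2^31, so each draw is rejected with probability under one half and the expected number of draws is under two:

```latex
p_{\text{reject}} = \frac{2^{32} \bmod n}{2^{32}} < \frac{1}{2},
\qquad
\mathbb{E}[\text{draws}] = \frac{1}{1 - p_{\text{reject}}} < 2
```

The worst case is a bound just above 2^31, where almost half of all raw values are rejected; for small bounds the rejection probability is negligible.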

For example, if most callers will be specifying a max value <= 255 then it might not make sense to ask for four bytes of randomness every time. On the other hand, the performance benefit of requesting fewer bytes might be outweighed by the additional cost of always checking how many you actually need. (And, of course, once you do have specific information then you can keep optimising and testing until your results are good enough.)
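As a rough sketch of that trade-off, the method could branch on the bound and draw a single byte when maxValue fits in 8 bits. The class `WidthAwareSampler`, the `Func<byte>` and `Func<uint>` sources standing in for `Byte()` and `UInt32()`, and the `RawBytesConsumed` counter are all assumptions made for illustration; whether the extra branch is worthwhile would have to be measured.

```csharp
using System;

sealed class WidthAwareSampler
{
    private readonly Func<byte> _byte;   // assumed uniform over 0..255
    private readonly Func<uint> _uint32; // assumed uniform over the full uint range

    // Bookkeeping for illustration only: raw bytes requested so far.
    public int RawBytesConsumed { get; private set; }

    public WidthAwareSampler(Func<byte> byteSource, Func<uint> uint32Source)
    {
        _byte = byteSource;
        _uint32 = uint32Source;
    }

    public uint UniformUInt(uint inclusiveMaxValue)
    {
        if (inclusiveMaxValue <= byte.MaxValue)
        {
            // Small bound: rejection sampling on a single byte per attempt.
            uint n = inclusiveMaxValue + 1; // 1..256, cannot overflow
            uint limit = 256 - (256 % n);   // accept raw values below this
            uint bits;
            do
            {
                bits = _byte();
                RawBytesConsumed += 1;
            } while (bits >= limit);
            return bits % n;
        }

        unchecked
        {
            // Larger bound: the 32-bit method from the answer.
            uint exclusiveMaxValue = inclusiveMaxValue + 1;
            if ((exclusiveMaxValue & (~exclusiveMaxValue + 1)) == exclusiveMaxValue)
            {
                RawBytesConsumed += 4;
                return _uint32() & inclusiveMaxValue;
            }

            uint raw, val;
            do
            {
                raw = _uint32();
                RawBytesConsumed += 4;
                val = raw % exclusiveMaxValue;
            } while (raw - val + inclusiveMaxValue < inclusiveMaxValue);
            return val;
        }
    }
}
```

With a bound of 9, for instance, raw bytes 250 through 255 are rejected, so a call usually consumes one byte instead of four.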
