Frugal conversion of uniformly distributed random numbers from one range to another

Question

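Is there a way to frugally convert uniformly distributed random numbers of one range into uniformly distributed random numbers of another range?

Let me explain what I mean by "frugally".

The typical approach to generating a random number within a given range (e.g. r ∈ [0..10)) is to take some fixed number of random bits, say 31, which yields a non-negative random number less than 2147483648. Then check that the value is less than 2147483640 (2147483648 is not divisible by 10, so using the full range would lead to an uneven distribution). If the value is greater than or equal to 2147483640, throw it away and try again (take the next 31 random bits, and so on). If the value is less than 2147483640, return the remainder of its division by 10. This approach consumes at least 31 bits per decimal digit. Since the theoretical limit is log2(10) = 3.321928..., it is quite wasteful.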
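As a minimal sketch of this typical approach (not taken from the original post), the Java snippet below draws 31 bits per attempt and rejects values at or above 2147483640. The class name `NaiveDigit`, the method `nextDigit`, and the use of `java.util.Random` as the bit source are assumptions made for illustration only.

```java
import java.util.Random;

public class NaiveDigit {
    // Largest multiple of 10 below 2^31 = 2147483648.
    static final int LIMIT = 2147483640;

    // Draws 31 random bits per attempt and rejects values >= LIMIT,
    // so every residue modulo 10 is equally likely.
    static int nextDigit(Random rng) {
        int r;
        do {
            r = rng.nextInt() >>> 1; // 31 non-negative random bits
        } while (r >= LIMIT);
        return r % 10;
    }

    public static void main(String[] args) {
        Random rng = new Random(); // hypothetical stand-in for the real source
        for (int i = 0; i < 10; i++) {
            System.out.print(nextDigit(rng));
        }
        System.out.println();
    }
}
```

Every call spends at least 31 source bits to produce a single digit, which is the waste described above.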

We can improve on this by using 4 bits instead of 31. In this case we consume 4 × 1.6 = 6.4 bits per decimal digit. This is more frugal, but still far from ideal.

    public int nextDit() {
        int result;
        do {
            result = next4Bits();
        } while (result >= 10);
        return result;
    }

We can try to generate 3 decimal digits at once. Since 1024 is quite close to 1000, the probability that a raw source random number will be rejected is lower than in the previous case. Once we have generated 3 decimal digits, we return 1 digit and reserve the remaining 2 digits.

Something like the following:

    private int _decDigits = 0;
    private int _decCount = 0;

    public int nextDit() {
        if (_decCount > 0) {
            // take numbers from the reserve
            int result = _decDigits % 10;
            _decDigits /= 10;
            _decCount -= 1;
            return result;
        } else {
            int result;
            do {
                result = next10Bits();
            } while (result >= 1000);
            // reserve 2 decimal digits
            _decCount = 2;
            _decDigits = result % 100;
            result /= 100;
            return result;
        }
    }

This approach is much more frugal: it consumes 10 × 1.024 / 3 = 3.41(3) bits per decimal digit.
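The cost figures quoted so far follow one pattern: bits drawn per attempt, times the expected number of attempts, divided by the digits obtained per accepted draw. The sketch below (the names `BatchCost` and `bitsPerDigit` are hypothetical) evaluates that formula for a batch of n digits, assuming a plain reject-and-retry scheme with the smallest k such that 2^k >= 10^n.

```java
public class BatchCost {
    // Expected source bits per decimal digit when n digits are taken from one
    // k-bit draw, where k is the smallest value with 2^k >= 10^n and any draw
    // that is >= 10^n is rejected and retried.
    static double bitsPerDigit(int n) {
        long target = 1;
        for (int i = 0; i < n; i++) {
            target *= 10;                              // 10^n
        }
        int k = 0;
        while ((1L << k) < target) {
            k++;                                       // smallest k with 2^k >= 10^n
        }
        double attempts = (double) (1L << k) / target; // expected draws until accepted
        return k * attempts / n;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 9; n++) {
            System.out.printf("n=%d: %.4f bits per digit%n", n, bitsPerDigit(n));
        }
    }
}
```

For n = 1 this gives the 6.4 bits computed for the 4-bit variant, and for n = 3 it gives the 3.41(3) bits stated above.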

We can go even further if we reuse the numbers that we previously threw away. The random number r ∈ [0, 1024) falls into one of 3 ranges: [0, 1000), [1000, 1020), [1020, 1024).

If it falls into [0, 1000), we do as we did before, reserve 2 decimal digits (in decimal digit reserve) and return 1 decimal digit.

If it falls into [1000, 1020), we subtract 1000, converting it to the range [0, 20). Then we get 1 bit by dividing it by 10 and 1 decimal digit by taking the remainder of the division by 10. We put the bit into the binary digit reserve and return the decimal digit.

If it falls into [1020, 1024), we subtract 1020 converting to the range [0, 4). Here we get just 2 bits, which we put to the binary digits reserve.

    // decimal digit reserve
    private int _decDigits = 0;
    private int _decCount = 0;
    // binary digit reserve
    private int _binDigits = 0;
    private int _binCount = 0;

    private int nextBits(int bits, int n) {
        for (int i = 0; i < n; i += 1) {
            bits = (bits << 1) + _bitRandomDevice.nextBit();
        }
        return bits;
    }

    private int next10Bits() {
        // take bits from the binary reserve first, then from _bitRandomDevice
        int result;
        if (_binCount >= 10) {
            result = _binDigits >> (_binCount - 10);
            // parenthesized: in Java, "-" binds tighter than "<<"
            _binDigits = _binDigits & ((1 << (_binCount - 10)) - 1);
            _binCount -= 10;
        } else {
            result = nextBits(_binDigits, 10 - _binCount);
            _binCount = 0;
            _binDigits = 0;
        }
        return result;
    }

    public int nextDit() {
        if (_decCount > 0) {
            // take numbers from the decimal reserve
            int result = _decDigits % 10;
            _decDigits /= 10;
            _decCount -= 1;
            return result;
        } else {
            int result;
            while (true) {
                result = next10Bits();
                if (result < 1000) {
                    assert result >= 0 && result < 1000;
                    // reserve 2 decimal digits
                    _decCount = 2;
                    _decDigits = result % 100;
                    result /= 100;
                    // return 1 decimal digit
                    return result;
                } else if (result < 1020) {
                    result -= 1000;
                    assert result >= 0 && result < 20;
                    // reserve 1 binary digit
                    _binCount += 1;
                    _binDigits = (_binDigits << 1) + (result / 10);
                    // return 1 decimal digit
                    return result % 10;
                } else {
                    result -= 1020;
                    assert result >= 0 && result < 4;
                    // reserve 2 binary digits
                    _binCount += 2;
                    _binDigits = (_binDigits << 2) + result;
                }
            }
        }
    }

This approach consumes about 3.38... bits per decimal digit. This is the most frugal approach I can find, but it still wastes/loses some information from the source of randomness.
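One way to sanity-check the 3.38 figure is to count the fresh bits actually drawn. The harness below is a sketch, not the original code: it restates the same three-range scheme compactly, uses `java.util.Random` as a stand-in for `_bitRandomDevice`, and adds a hypothetical `bitsConsumed` counter and `measure` helper.

```java
import java.util.Random;

public class FrugalDigits {
    private final Random rng;                 // stand-in bit source (an assumption)
    long bitsConsumed = 0;                    // fresh bits drawn so far
    private int decDigits = 0, decCount = 0;  // decimal digit reserve
    private int binDigits = 0, binCount = 0;  // binary digit reserve

    FrugalDigits(long seed) {
        rng = new Random(seed);
    }

    private int nextBit() {
        bitsConsumed++;
        return rng.nextBoolean() ? 1 : 0;
    }

    // Tops up the reserved bits with fresh ones to form a 10-bit number.
    // The reserve never holds more than 2 bits here, so it is always emptied.
    private int next10Bits() {
        int result = binDigits;
        for (int i = binCount; i < 10; i++) {
            result = (result << 1) + nextBit();
        }
        binDigits = 0;
        binCount = 0;
        return result;
    }

    int nextDit() {
        if (decCount > 0) {                   // serve from the decimal reserve
            int result = decDigits % 10;
            decDigits /= 10;
            decCount--;
            return result;
        }
        while (true) {
            int result = next10Bits();
            if (result < 1000) {              // 3 digits: return 1, reserve 2
                decCount = 2;
                decDigits = result % 100;
                return result / 100;
            } else if (result < 1020) {       // 1 digit plus 1 reserved bit
                result -= 1000;
                binCount += 1;
                binDigits = (binDigits << 1) + result / 10;
                return result % 10;
            } else {                          // no digit, 2 reserved bits
                binCount += 2;
                binDigits = (binDigits << 2) + (result - 1020);
            }
        }
    }

    // Average number of fresh source bits per generated digit.
    static double measure(long seed, int digits) {
        FrugalDigits g = new FrugalDigits(seed);
        for (int i = 0; i < digits; i++) {
            g.nextDit();
        }
        return (double) g.bitsConsumed / digits;
    }

    public static void main(String[] args) {
        System.out.println(measure(42L, 3_000_000));
    }
}
```

In the long run the measured average should settle near (10 − 28/1024) / (3020/1024) ≈ 3.381, since each 10-bit draw recycles 28/1024 bits on average and yields 3020/1024 digits on average.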

Thus, my question is: Is there any universal approach/algorithm that converts uniformly distributed random numbers of one arbitrary range [0, s) (later called source numbers) to uniformly distributed random numbers of another arbitrary range [0, t) (later called target numbers), consuming only log_s(t) + C source numbers per target number, where C is some constant? If there is no such approach, why? What prevents us from reaching the ideal limit?

The purpose of being frugal is to reduce the number of calls to the RNG. This can be especially worthwhile when working with a true RNG, which often has limited throughput.

As for "frugality optimizations", they are based on the following assumptions:

  • given uniform random number r ∈ [0,N), after checking that r < M (if M <= N), we may assume that it's uniformly distributed in [0,M). Traditional rejection approach is actually based on this assumption. Similarly, after checking that r >= M, we may assume that it's uniformly distributed in [M,N).
  • given uniform random number r ∈ [A,B), the derived random number (r+C) is uniformly distributed in [A+C,B+C). I.e. we can add and subtract any constant to random number to shift its range.
  • given uniform random number r ∈ [0,N), where N = P × Q, the derived random numbers (r%P) and (r/P) are uniformly distributed in [0,P) and [0,Q) respectively. I.e. we can split one uniform random number into several ones.
  • given uniform random numbers p ∈ [0,P) and q ∈ [0,Q), the derived random number (q× P + p) is uniformly distributed in [0,P × Q). I.e. we can combine uniform random numbers into one.
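The split and combine assumptions can be checked exhaustively for small ranges: if r ↦ (r % P, r / P) hits every pair exactly once, then a uniform r yields uniform parts, and combining them restores r. The sketch below does that check. `SplitCombine`, `split`, `combine`, and `isBijection` are hypothetical names.

```java
public class SplitCombine {
    // Splits r in [0, P*Q) into {r % P, r / P}.
    static int[] split(int r, int p) {
        return new int[] { r % p, r / p };
    }

    // Combines lo in [0, P) and hi in [0, Q) into hi * P + lo in [0, P*Q).
    static int combine(int lo, int hi, int p) {
        return hi * p + lo;
    }

    // True if split() maps [0, P*Q) onto each (lo, hi) pair exactly once
    // and combine() is its inverse.
    static boolean isBijection(int p, int q) {
        boolean[][] seen = new boolean[p][q];
        for (int r = 0; r < p * q; r++) {
            int[] parts = split(r, p);
            if (seen[parts[0]][parts[1]]) {
                return false;                 // pair produced twice
            }
            seen[parts[0]][parts[1]] = true;
            if (combine(parts[0], parts[1], p) != r) {
                return false;                 // combine does not undo split
            }
        }
        return true;                          // P*Q distinct pairs, so all are covered
    }

    public static void main(String[] args) {
        // 1000 = 10 * 100: one digit returned, two digits reserved.
        System.out.println(isBijection(10, 100));
    }
}
```

The case P = 10, Q = 100 corresponds to the step that returns one decimal digit and reserves two, and P = 10, Q = 2 corresponds to the [0, 20) range that yields one digit and one bit.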

Answer

Your goal is ultimately to roll a k-sided die given only a p-sided die, without wasting randomness.

In this sense, by Lemma 3 in "Simulating a dice with a dice" by B. Kloeckner, this waste is inevitable unless "every prime number dividing k also divides p". Thus, for example, if p is a power of 2 (and any block of random bits is the same as rolling a die with a power of 2 number of faces) and k has prime factors other than 2, the best you can do is get arbitrarily close to no waste of randomness.
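The quoted condition is easy to test for concrete values. The sketch below (hypothetical names `DiceCondition` and `everyPrimeOfKDividesP`, assuming p >= 1 and k >= 1) repeatedly strips from k the factors it shares with p. If k reaches 1, every prime dividing k also divides p.

```java
public class DiceCondition {
    static long gcd(long a, long b) {
        while (b != 0) {
            long t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    // True when every prime dividing k also divides p (p >= 1, k >= 1).
    static boolean everyPrimeOfKDividesP(long p, long k) {
        long g;
        while ((g = gcd(k, p)) > 1) {
            k /= g;       // remove the factors that k shares with p
        }
        return k == 1;    // a leftover factor is a prime that p lacks
    }

    public static void main(String[] args) {
        System.out.println(everyPrimeOfKDividesP(2, 10)); // bits to decimal digits
        System.out.println(everyPrimeOfKDividesP(10, 8)); // decimal digits to 3-bit values
    }
}
```

For p = 2 and k = 10 the check fails because of the factor 5, which matches the example above: decimal digits cannot be produced from random bits with zero waste, only with waste made arbitrarily small.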

Also, besides batching of bits to reduce "bit waste" (see also the Math Forum), there is also the technique of randomness extraction, discussed in Devroye and Gravel 2015-2020 and in my Note on Randomness Extraction.

See also the question: How to generate a random integer in the range [0,n] from a stream of random bits without wasting bits?, especially my answer there.
