Get n distinct random numbers between two values whose sum is equal to a given number


Question

I would like to find distinct random numbers within a range that sum to a given number.

Note: I found similar questions on Stack Overflow, but they do not address exactly this problem (i.e. they do not consider a negative lowerLimit for the range).

If I wanted the sum of my random numbers to equal 1, I would just generate the required random numbers, compute their sum and divide each of them by it. Here I need something a bit different: the random numbers must add up to a value other than 1, and they must still lie within a given range.

Example: I need 30 distinct random numbers (non-integers) between -50 and 50 whose sum must equal 300. I wrote the code below, but it does not work when n is much larger than the range (upperLimit - lowerLimit): the function can return numbers outside the range [lowerLimit, upperLimit]. Any help improving the current solution?

static void Main(string[] args)
{
    var listWeights = GetRandomNumbersWithConstraints(30, 50, -50, 300);
}

private static List<double> GetRandomNumbersWithConstraints(int n, int upperLimit, int lowerLimit, int sum)
{
    if (upperLimit <= lowerLimit || n < 1)
        throw new ArgumentOutOfRangeException();

    Random rand = new Random(Guid.NewGuid().GetHashCode());
    List<double> weight = new List<double>();

    for (int k = 0; k < n; k++)
    {
        //multiply by rand.NextDouble() to avoid duplicates
        double temp = (double)rand.Next(lowerLimit, upperLimit) * rand.NextDouble();

        if (weight.Contains(temp))
            k--;
        else
            weight.Add(temp);
    }

    //divide each element by the sum
    weight = weight.ConvertAll<double>(x => x / weight.Sum());  //here the sum of my weight will be 1 

    return weight.ConvertAll<double>(x => x * sum);
}

EDIT - for clarification

Running the current code generates the following 30 numbers, which add up to 300. However, they do not all lie between -50 and 50:

-4.425315699
67.70219958
82.08592061
46.54014109
71.20352208
-9.554070146
37.65032717
-75.77280868
24.68786878
30.89874589
142.0796933
-1.964407284
9.831226893
-15.21652248
6.479463312
49.61283063
118.1853036
-28.35462683
49.82661159
-65.82706541
-29.6865969
-54.5134262
-56.04708803
-84.63783048
-3.18402453
-13.97935982
-44.54265204
112.774348
-2.911427266
-58.94098071

Recommended answer

OK, here is how it can be done.

We will use the Dirichlet distribution, which is a distribution over random numbers $x_i$ in the range [0...1] such that

$\sum_i x_i = 1$

So, after a linear rescaling, the condition on the sum is satisfied automatically. The Dirichlet distribution is parametrized by $\alpha_i$, but we assume all the random numbers come from the same marginal distribution, so a single parameter α is shared by every index.

The mean of each sampled number is exactly 1/n, and its variance is (n-1)/(n^2 (n*α + 1)), roughly 1/(n^2 * α) when n*α is large, so a larger α pulls the random values closer to the mean.

OK, now back to the rescaling,

$v_i = A + B x_i$

And we have to find A and B. As @HansKesting rightly noted, with only two free parameters we can satisfy only two constraints, but you have three. So we strictly satisfy the lower-bound constraint and the sum constraint, but occasionally violate the upper-bound constraint. In that case we just throw the whole sample away and draw another one.
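For reference, A and B follow from the two constraints that are enforced exactly. This is only a short derivation sketch; the symbols n (number of values), S (required sum) and L (lower limit) are introduced here for illustration. The result agrees with the expression `lo + (mean - lo)*(double)n * rn[k]` used in the answer's code.

```latex
\begin{aligned}
x_i = 0 \;&\Rightarrow\; v_i = A = L \\
\sum_i v_i = nA + B\sum_i x_i = nL + B = S \;&\Rightarrow\; B = S - nL = n\left(\tfrac{S}{n} - L\right)
\end{aligned}
```

For the question's numbers (n = 30, L = -50, S = 300) this gives B = 1800, so $v_i = -50 + 1800\,x_i$, and the upper limit of 50 is exceeded whenever $x_i > 100/1800 = 1/18$.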

Again, we have a knob to turn: a larger α keeps the values closer to the mean and makes them less likely to hit the upper bound. With α = 1 I rarely get a good sample, with α = 10 close to 40% of the samples are good, and with α = 16 close to 80% are good.
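A rough calculation suggests why a larger α helps. It assumes the standard variance of a symmetric Dirichlet component, $\mathrm{Var}(x_i) = (n-1)/(n^2(n\alpha+1))$, and the rescaling $v_i = -50 + 1800\,x_i$ implied by the question's numbers (n = 30, sum = 300, lower limit -50); the figures are rounded.

```latex
\begin{aligned}
\mathbb{E}[v_i] &= -50 + \frac{1800}{30} = 10 \\
\sigma(v_i) &= 1800\sqrt{\frac{29}{900\,(30\alpha + 1)}} \approx
\begin{cases}
58   & \alpha = 1 \\
18.6 & \alpha = 10 \\
14.7 & \alpha = 16
\end{cases}
\end{aligned}
```

So the upper limit of 50 sits about 0.7, 2.1 and 2.7 standard deviations above the mean respectively, which is consistent with rejections becoming rarer as α grows.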

Dirichlet sampling is done via the Gamma distribution, using code from MathDotNet (the MathNet.Numerics package).

Code, tested with .NET Core 2.1

using System;

using MathNet.Numerics.Distributions;
using MathNet.Numerics.Random;

class Program
{
    static void SampleDirichlet(double alpha, double[] rn)
    {
        if (rn == null)
            throw new ArgumentException("SampleDirichlet:: Results placeholder is null");

        if (alpha <= 0.0)
            throw new ArgumentException($"SampleDirichlet:: alpha {alpha} is non-positive");

        int n = rn.Length;
        if (n == 0)
            throw new ArgumentException("SampleDirichlet:: Results placeholder is of zero size");

        var gamma = new Gamma(alpha, 1.0);

        double sum = 0.0;
        for(int k = 0; k != n; ++k) {
            double v = gamma.Sample();
            sum  += v;
            rn[k] = v;
        }

        if (sum <= 0.0)
            throw new ApplicationException($"SampleDirichlet:: sum {sum} is non-positive");

        // normalize
        sum = 1.0 / sum;
        for(int k = 0; k != n; ++k) {
            rn[k] *= sum;
        }
    }

    static bool SampleBoundedDirichlet(double alpha, double sum, double lo, double hi, double[] rn)
    {
        if (rn == null)
            throw new ArgumentException("SampleDirichlet:: Results placeholder is null");

        if (alpha <= 0.0)
            throw new ArgumentException($"SampleDirichlet:: alpha {alpha} is non-positive");

        if (lo >= hi)
            throw new ArgumentException($"SampleDirichlet:: low {lo} is not smaller than high {hi}");

        int n = rn.Length;
        if (n == 0)
            throw new ArgumentException("SampleDirichlet:: Results placeholder is of zero size");

        double mean = sum / (double)n;
        if (mean < lo || mean > hi)
            throw new ArgumentException($"SampleDirichlet:: mean value {mean} is not within [{lo}...{hi}] range");

        SampleDirichlet(alpha, rn);

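        // Rescale x in [0...1] to v = lo + (sum - n*lo) * x, so that every v >= lo
        // and the values add up to sum. The upper bound is only checked, not enforced.
        bool rc = true;
        for(int k = 0; k != n; ++k) {
            double v = lo + (mean - lo)*(double)n * rn[k];
            if (v > hi)
                rc = false; // at least one value exceeds hi: caller must reject this sample
            rn[k] = v;
        }
        return rc;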
    }

    static void Main(string[] args)
    {
        double[] rn = new double [30];

        double lo = -50.0;
        double hi =  50.0;

        double alpha = 10.0;

        double sum = 300.0;

        for(int k = 0; k != 1_000; ++k) {
            var q = SampleBoundedDirichlet(alpha, sum, lo, hi, rn);
            Console.WriteLine($"Rng(BD), v = {q}");
            double s = 0.0;
            foreach(var r in rn) {
                Console.WriteLine($"Rng(BD),     r = {r}");
                s += r;
            }
            Console.WriteLine($"Rng(BD),    summa = {s}");
        }
    }
}

UPDATE

Usually, when people ask such a question, there is an implicit assumption/requirement: all the random numbers shall be distributed in the same way. It means that if I plot the marginal probability density function (PDF) for the item at index 0 of the sampled array, I should get the same distribution as when I plot the marginal PDF for the last item in the array. People usually sample random arrays to pass them down to other routines that do some interesting work. If the marginal PDF for item 0 differs from the marginal PDF for the last item, then simply reversing the array will produce wildly different results in the code that uses those random values.

Here I plotted the distributions of random numbers for item 0 and the last item (#29) under the original conditions ([-50...50], sum = 300), using my sampling routine. They look similar, don't they?

OK, and here is a picture from your sampling routine, with the same original conditions ([-50...50], sum = 300) and the same number of samples.

UPDATE II

The user is supposed to check the return value of the sampling routine, and accept and use the sampled array if (and only if) the return value is true. This is the acceptance/rejection method. As an illustration, below is the code used to histogram the samples:

        int[] hh = new int[100]; // histogram allocated

        var s = 1.0; // step size
        int k = 0;   // good samples counter
        for( ;; ) {
            var q = SampleBoundedDirichlet(alpha, sum, lo, hi, rn);
            if (q) // good sample, accept it
            {
                var v = rn[0]; // any index, 0 or 29 or ....
                var i = (int)((v - lo) / s);
                i = System.Math.Max(i, 0);
                i = System.Math.Min(i, hh.Length-1);
                hh[i] += 1;

                ++k;
                if (k == 100000) // required number of good samples reached
                    break;
            }
        }
        for(k = 0; k != hh.Length; ++k)
        {
            var x = lo + (double)k * s + 0.5*s;
            var v = hh[k];
            Console.WriteLine($"{x}     {v}");
        }
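If only one accepted array is needed, the same acceptance/rejection loop can be wrapped in a small helper. The sketch below is not part of the original answer: the class `RejectionSampling`, the method `SampleUntilAccepted` and the `maxAttempts` cap are hypothetical names chosen for illustration. It assumes a sampler that fills the buffer and returns true when the sample is acceptable, as `SampleBoundedDirichlet` does.

```csharp
using System;

static class RejectionSampling
{
    // Hypothetical helper (an assumption, not from the original answer).
    // Calls trySample repeatedly until it reports an acceptable sample,
    // and returns the number of attempts that were needed.
    public static int SampleUntilAccepted(Func<double[], bool> trySample, double[] buffer, int maxAttempts)
    {
        if (trySample == null)
            throw new ArgumentNullException(nameof(trySample));
        if (buffer == null)
            throw new ArgumentNullException(nameof(buffer));
        if (maxAttempts < 1)
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (int attempt = 1; attempt <= maxAttempts; ++attempt)
        {
            if (trySample(buffer))
                return attempt; // accepted: buffer now holds a valid sample
        }

        // Every attempt was rejected; with the sampler above, a larger alpha may help.
        throw new InvalidOperationException($"No sample accepted in {maxAttempts} attempts");
    }
}
```

From inside `Program` it could be called as `RejectionSampling.SampleUntilAccepted(buf => SampleBoundedDirichlet(alpha, sum, lo, hi, buf), rn, 1000);`. The cap on attempts is a design choice that guards against looping forever when the acceptance rate is very low, as reported for α = 1.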
