Generating N uniform random numbers that sum to M

Question

This question has been asked before, but I've never really seen a good answer.

  1. I want to generate 8 random numbers that sum to 0.5.

I want each number to be randomly chosen from a uniform distribution (i.e., the simple function below will not work because the resulting numbers are not uniformly distributed).

import random

def rand_constrained(n, tot):
    r = [random.random() for _ in range(n)]  # n independent U(0, 1) draws
    s = sum(r)
    return [x / s * tot for x in r]  # rescale so the values sum to tot

The code should be generalizable, so that you can generate N uniform random numbers that sum to M (where M is a positive float). If possible, can you please also explain (or show with a plot) why your solution generates random numbers uniformly in the appropriate range?

Related questions that fall short:

Generate multiple random numbers to equal a value in python (current accepted answer isn't uniform--another answer which is uniform only works with integers)

Getting N random numbers that the sum is M (same question in Java, currently accepted answer is just plain wrong, also no answers with uniform distribution)

Generate N random integers that sum to M in R (same question, but in R with a normal--not uniform--distribution)

Any help is greatly appreciated.

Recommended answer

What you are asking for seems to be impossible.

However, I will re-interpret your question so that it makes more sense and is possible to solve. What you need is a probability distribution on the seven-dimensional hyperplane x_1 + x_2 + ... + x_8 = 0.5. Since the hyperplane is infinite in extent, a uniform distribution on the whole hyperplane will not work. What you probably(?) want is the chunk of the hyperplane where all the x_i>0. That region is a simplex, a generalization of a triangle, and the uniform distribution on the simplex is a special case of the Dirichlet Distribution.
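For reference, the region and the distribution described above can be written out as follows. This is a sketch in standard notation, not part of the original answer; the symbols S, y_i, and k are introduced here for illustration, with y_i = x_i / 0.5 rescaling the point onto the unit simplex.

```latex
% The simplex: the part of the hyperplane with non-negative coordinates
S = \Bigl\{\, x \in \mathbb{R}^{8} : x_i \ge 0,\ \sum_{i=1}^{8} x_i = 0.5 \,\Bigr\}

% Dirichlet density on the unit simplex (y_i \ge 0, \sum_i y_i = 1)
f(y_1, \dots, y_k;\ a_1, \dots, a_k)
  = \frac{\Gamma\!\bigl(\sum_{i=1}^{k} a_i\bigr)}{\prod_{i=1}^{k} \Gamma(a_i)}
    \prod_{i=1}^{k} y_i^{\,a_i - 1}

% With all a_i = 1 the product is 1, so the density is the constant
f(y_1, \dots, y_k;\ 1, \dots, 1) = \Gamma(k) = (k-1)!
```

A constant density over the simplex is exactly what "uniform on the simplex" means, which is why the all-ones Dirichlet is the relevant special case.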

You may find this section of the Dirichlet Distribution Wikipedia article, string cutting, particularly illuminating.
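One way to build intuition for cutting a string into pieces is the sketch below. It uses the well-known "sorted cut points" construction, which also yields a uniform distribution on the simplex; it is not necessarily the exact procedure the Wikipedia section describes, and the function name `cut_string` is made up for this example.

```python
import random

def cut_string(n, total):
    """Split a string of length `total` into n pieces at n-1 uniform random cuts."""
    # n-1 cut positions drawn independently and uniformly along the string
    cuts = sorted(random.uniform(0.0, total) for _ in range(n - 1))
    # Add both ends so that consecutive differences give the n piece lengths
    edges = [0.0] + cuts + [total]
    return [b - a for a, b in zip(edges, edges[1:])]

pieces = cut_string(8, 0.5)
print(pieces, sum(pieces))
```

The piece lengths are non-negative and sum to `total` up to floating-point rounding.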

Implementation

The Wikipedia article gives the following implementation in Python in the Random Number Generation section:

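import random

params = [a1, a2, ..., ak]  # placeholders for the k concentration parameters
sample = [random.gammavariate(a, 1) for a in params]  # independent Gamma(a, 1) draws
total = sum(sample)  # compute the normalizer once
sample = [v / total for v in sample]  # normalized sample sums to 1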

What you probably(?) want is the case where all ai=1 which results in a uniform distribution on the simplex. Here k corresponds to the number N in your question. To get the samples to sum to M instead of 1, just multiply sample by M.
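Putting the pieces together, a minimal sketch for the all-ones case might look like the following. The function name `rand_sum_to` and its parameters `n` and `total` (standing in for N and M) are chosen here for illustration and do not come from the Wikipedia snippet.

```python
import random

def rand_sum_to(n, total):
    """Draw n non-negative numbers summing to `total`, uniform over the simplex."""
    # All a_i = 1: Gamma(1, 1) is the exponential distribution with mean 1
    sample = [random.gammavariate(1.0, 1.0) for _ in range(n)]
    s = sum(sample)
    # Normalizing gives Dirichlet(1, ..., 1); scaling by `total` makes the sum M
    return [total * v / s for v in sample]

xs = rand_sum_to(8, 0.5)
print(xs, sum(xs))
```

Note that "uniform" here refers to the joint distribution over the simplex. Each individual component is not uniform on [0, M]: its marginal is M times a Beta(1, N-1) variable, with mean M/N, so small values are more likely than large ones.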

Update

Thanks to Severin Pappadeux for pointing out that gammavariate can return infinity under rare circumstances. That is mathematically "impossible" but can occur as an artifact of the implementation in terms of floating point numbers. My suggestion for handling that case is to check for it after sample is first calculated; if any of the components of sample are infinity, set all the non-infinity components to 0 and set all the infinity components to 1. Then when the xi are calculated, outcomes like xi=1, all other x's=0, or xi=1/2, xj=1/2, all other x's=0 will result, collectively "corner samples" and "edge samples".
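The check described above could be sketched roughly as follows. The helper name `normalize_with_inf_guard` is hypothetical, and the example inputs are hand-written lists containing `math.inf` rather than actual `gammavariate` output.

```python
import math

def normalize_with_inf_guard(sample):
    """Normalize gamma draws to sum to 1, mapping infinite draws to corner/edge samples."""
    if any(math.isinf(v) for v in sample):
        # Infinite components become 1, all finite components become 0
        sample = [1.0 if math.isinf(v) else 0.0 for v in sample]
    s = sum(sample)
    return [v / s for v in sample]

print(normalize_with_inf_guard([0.3, math.inf, 1.2]))       # [0.0, 1.0, 0.0] (corner sample)
print(normalize_with_inf_guard([math.inf, 0.7, math.inf]))  # [0.5, 0.0, 0.5] (edge sample)
```

Without the guard, dividing by an infinite sum would give 0 for finite components and NaN for the infinite ones.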

Another very-low-probability possibility is for the sum of the gammavariates to overflow. I would guess that we could run through the entire underlying pseudo-random number sequence and not see that happen, but theoretically it is possible (depending on the underlying pseudorandom number generator). The situation could be handled by rescaling sample, e.g., dividing all elements of sample by N, after the gammavariates have been calculated but before the x's are calculated. Personally, I wouldn't bother because the odds are so low; program crashes due to other reasons would have higher probability.
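For completeness, the rescaling idea could be sketched like this. The helper name `normalize_rescaled` is invented for this example, and the demonstration uses artificially huge hand-picked values, since real gamma draws of this size are not expected in practice.

```python
def normalize_rescaled(sample):
    """Normalize to sum to 1, dividing by N first so the sum stays in range."""
    n = len(sample)
    # After dividing by n, the sum is roughly at most the largest element,
    # so it cannot overflow as long as every element is finite
    scaled = [v / n for v in sample]
    s = sum(scaled)
    return [v / s for v in scaled]

huge = [1e308] * 8
print(sum(huge))                 # inf: the plain sum overflows
print(normalize_rescaled(huge))  # each entry is approximately 0.125
```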
