How to generate random numbers to satisfy a specific mean and median in Python?

Problem description

I would like to generate n random numbers (e.g., n=200) in the range 2 to 40, with a mean of 12 and a median of 6.5.

I searched everywhere and could not find a solution. I tried the following script, but it only works for small n such as 20; for larger n it takes ages and no result is returned.

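import numpy as np

n = 200
x = np.random.randint(0, 1, size=n)  # initialisation only (all zeros)
# Brute force: redraw the whole array until both constraints hold exactly.
while True:
    if x.mean() == 12 and np.median(x) == 6.5:
        break
    else:
        x = np.random.randint(2, 40, size=n)  # high is exclusive: draws 2..39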

Could anyone help me improve this so that it returns quickly, even when n=5000 or so?

Recommended answer

One way to get a result really close to what you want is to generate two separate random ranges of length 100 each that satisfy your median constraint and together cover the desired range of numbers. When you concatenate the arrays, the mean will be close to 12 but not exactly 12. Since the mean is the only constraint left to fix, you can get the expected result by tweaking one of these arrays.

In [162]: arr1 = np.random.randint(2, 7, 100)    
In [163]: arr2 = np.random.randint(7, 40, 100)

In [164]: np.mean(np.concatenate((arr1, arr2)))
Out[164]: 12.22

In [166]: np.median(np.concatenate((arr1, arr2)))
Out[166]: 6.5
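
As a rough check of the two-range idea (not part of the original answer), the sketch below repeats the draw many times using NumPy's Generator API. The seed, the trial count and the variable names are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
trials = 10_000

# Each row is one experiment: 100 values in 2..6 and 100 values in 7..39.
arr1 = rng.integers(2, 7, (trials, 100))
arr2 = rng.integers(7, 40, (trials, 100))
combined = np.concatenate((arr1, arr2), axis=1)

means = combined.mean(axis=1)
medians = np.median(combined, axis=1)

# Average sample mean across experiments, and how often the median is 6.5.
print(means.mean())
print((medians == 6.5).mean())
```

For these ranges the long-run average of the mean is (4 + 23) / 2 = 13.5, so a single draw such as 12.22 is on the low side and a surplus usually has to be removed. The median is 6.5 only when arr1 happens to contain a 6 and arr2 a 7, which is presumably why it helps to pin the two middle values explicitly.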

The following is a vectorized solution that constrains how the random sequence is created, so it avoids the for loops and Python-level code that make other approaches slow:

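import numpy as np

def gen_random():
    arr1 = np.random.randint(2, 7, 99)   # lower half: values 2..6
    arr2 = np.random.randint(7, 40, 99)  # upper half: values 7..39
    mid = [6, 7]                         # middle pair fixes the median at 6.5
    # Surplus of the current total (including mid's 13) over the target 12 * 200.
    excess = int(np.sum(arr1 + arr2) + 13 - 12 * 200)
    # The whole part comes off each of the 40 largest values; the remainder is
    # spread one unit at a time rather than taken from a single value,
    # which could otherwise drop below 7 and break the median.
    intg, rem = divmod(excess, 40)
    args = np.argsort(arr2)
    arr2[args[-40:]] -= intg
    arr2[args[99 - rem:]] -= 1
    return np.concatenate((arr1, mid, arr2))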

Demo:

arr = gen_random()
print(np.median(arr))
print(arr.mean())

6.5
12.0

The logic behind the function:

To get a random array that meets these criteria, we concatenate three arrays: arr1, mid and arr2. arr1 and arr2 each hold 99 items, and mid holds the two items 6 and 7, so the median of the final result is 6.5. We first create the two random arrays of length 99. To make the mean exactly 12, all we need to do is find the difference between the current sum and 12 * 200 and subtract it, split evenly, from the N largest numbers; here they are taken from arr2, with N=40 as in the code.
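
The same construction can be sketched for other even sizes, such as the n=5000 mentioned in the question. The function name gen_random_n, the choice of adjusting the n // 5 largest values (40 when n=200) and the retry loop are all assumptions made for illustration, not part of the original answer.

```python
import numpy as np

def gen_random_n(n, rng=None):
    """Hypothetical generalization of gen_random() to an even n >= 10."""
    if n % 2 or n < 10:
        raise ValueError("n must be even and at least 10")
    rng = np.random.default_rng() if rng is None else rng
    half = n // 2 - 1  # size of each random half (99 when n == 200)
    m = n // 5         # how many large values absorb the surplus (assumed)
    while True:
        arr1 = rng.integers(2, 7, half)   # lower half: values 2..6
        arr2 = rng.integers(7, 40, half)  # upper half: values 7..39
        # Surplus over the target total 12 * n; the 13 is the middle pair 6 + 7.
        excess = int(arr1.sum() + arr2.sum() + 13 - 12 * n)
        q, r = divmod(excess, m)
        order = np.argsort(arr2)
        arr2[order[half - m:]] -= q  # the m largest values each lose q
        arr2[order[half - r:]] -= 1  # r of them lose one more (none if r == 0)
        # Retry in the unlikely case the adjustment leaves the 7..40 band,
        # which would break either the median or the allowed range.
        if arr2.min() >= 7 and arr2.max() <= 40:
            return np.concatenate((arr1, [6, 7], arr2))

arr = gen_random_n(5000, np.random.default_rng(0))
print(np.median(arr), arr.mean())
```

Because every value stays an integer and the total is forced to 12 * n, the mean comes out exact, and the retry check keeps the upper half at or above 7 so the middle pair still decides the median.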

If it's not a problem to have float numbers in your result, you can shorten the function as follows:

import numpy as np

def gen_random():
    # np.float was removed in NumPy 1.24; the builtin float is equivalent here.
    arr1 = np.random.randint(2, 7, 99).astype(float)
    arr2 = np.random.randint(7, 40, 99).astype(float)
    mid = [6, 7]
    # Amount to take off each of the 40 largest values so the total is 12 * 200.
    i = ((np.sum(arr1 + arr2) + 13) - (12 * 200)) / 40
    args = np.argsort(arr2)
    arr2[args[-40:]] -= i
    return np.concatenate((arr1, mid, arr2))
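
With floats, the mean can differ from 12 by a rounding error in the last digits, so a tolerance-based comparison is safer than ==. The sketch below restates the float variant under a hypothetical name, gen_random_float, taking an explicit Generator so the run is repeatable; the seed is arbitrary.

```python
import numpy as np

def gen_random_float(rng):
    # Same construction as the float variant above, with an explicit Generator.
    arr1 = rng.integers(2, 7, 99).astype(float)
    arr2 = rng.integers(7, 40, 99).astype(float)
    # Amount removed from each of the 40 largest values of arr2.
    shift = (arr1.sum() + arr2.sum() + 13 - 12 * 200) / 40
    arr2[np.argsort(arr2)[-40:]] -= shift
    return np.concatenate((arr1, [6, 7], arr2))

arr = gen_random_float(np.random.default_rng(1))
print(np.median(arr))
print(np.isclose(arr.mean(), 12.0))  # compare with a tolerance, not ==
```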
