如何创建一个总和为x的随机整数向量列表 [英] How to create a list of random integer vector whose sum is x

查看:83
本文介绍了如何创建一个总和为x的随机整数向量列表的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

创建一个总和为X(例如X = 1000)的随机向量非常简单:

Creating a random vector whose sum is X (e.g. X=1000) is fairly straight forward:

import random
def RunFloat():
    Scalar = 1000
    VectorSize = 30
    RandomVector = [random.random() for i in range(VectorSize)]
    RandomVectorSum = sum(RandomVector)
    RandomVector = [Scalar*i/RandomVectorSum for i in RandomVector]
    return RandomVector
RunFloat()

上面的代码创建了一个向量,其值是浮点型,总和为1000.

The code above create a vector whose values are floats and sum is 1000.

我很难创建一个简单的函数来创建一个向量,该向量的值为整数且总和为X(例如X = 1000 * 30)

I'm having difficulty creating a simple function for creating a vector whose values are integers and sum is X (e.g. X=1000*30)

import random
def RunInt():
    LowerBound = 600
    UpperBound = 1200
    VectorSize = 30
    RandomVector = [random.randint(LowerBound,UpperBound) for i in range(VectorSize)]
    RandomVectorSum = 1000*30
    #Sanity check that our RandomVectorSum is sensible/feasible
    if LowerBound*VectorSize <= RandomVectorSum and RandomVectorSum <= UpperBound*VectorSum:
        if sum(RandomVector) == RandomVectorSum:
            return RandomVector
        else:
            RunInt()  

有人对这个想法有什么建议吗?我的代码可能永远不会完成或遇到递归深度问题.

Does anyone have any suggestions to improve on this idea? My code might never finish or run into recursion depth problems.

感谢Oliver,mgilson和Dougal的投入.我的解决方案如下所示.

Thanks to Oliver, mgilson, and Dougal for their inputs. My solution is shown below.

  1. Oliver在多项式分配思想上很有创造力
  2. 简单地说,(1)很有可能比其他解决方案输出更多的解决方案. Dougal通过简单的大数定律测试/反例证明了多项式解空间分布不是均匀或正态的. Dougal还建议使用numpy的多项式函数,这样可以避免很多麻烦,痛苦和头痛.
  3. 为了克服(2)的输出问题,我使用RunFloat()来使显示的内容(我没有对其进行测试,因此它只是表面的外观)更均匀地分布.与(1)相比,这有什么不同?我真的不懂副手.足够供我使用.
  4. 再次感谢mgilson不使用numpy的替代方法.
  1. Oliver was very creative with the multinomial distribution idea
  2. Put simply, (1) is very likely to output certain solutions more so than others. Dougal demonstrated that the multinomial solution space distribution is not uniform or normal by a simple test/counter example of Law of Large Numbers. Dougal also suggested to use numpy's multinomial function which saves me a lot of trouble, pain, and headaches.
  3. To overcome (2)'s output issue, I use RunFloat() to give what appears (I haven't tested this so its just a superficial appearance) to be a more uniform distribution. How much of a difference does this make compared to (1)? I don't really know off-hand. It's good enough for my use though.
  4. Thanks again to mgilson for the alternative method that does not use numpy.

这是我为此编辑编写的代码:

Here is the code that I have made for this edit:

我意识到正态分布没有正确实现,因此我将其修改为以下内容:

I realized that the normal distribution is not correctly implemented, I have since modified it to the following:

import random
def RandFloats(Size):
    Scalar = 1.0
    VectorSize = Size
    RandomVector = [random.random() for i in range(VectorSize)]
    RandomVectorSum = sum(RandomVector)
    RandomVector = [Scalar*i/RandomVectorSum for i in RandomVector]
    return RandomVector

from numpy.random import multinomial
import math
def RandIntVec(ListSize, ListSumValue, Distribution='Normal'):
    """
    Inputs:
    ListSize = the size of the list to return
    ListSumValue = The sum of list values
    Distribution = can be 'uniform' for uniform distribution, 'normal' for a normal distribution ~ N(0,1) with +/- 5 sigma  (default), or a list of size 'ListSize' or 'ListSize - 1' for an empirical (arbitrary) distribution. Probabilities of each of the p different outcomes. These should sum to 1 (however, the last element is always assumed to account for the remaining probability, as long as sum(pvals[:-1]) <= 1).  
    Output:
    A list of random integers of length 'ListSize' whose sum is 'ListSumValue'.
    """
    if type(Distribution) == list:
        DistributionSize = len(Distribution)
        if ListSize == DistributionSize or (ListSize-1) == DistributionSize:
            Values = multinomial(ListSumValue,Distribution,size=1)
            OutputValue = Values[0]
    elif Distribution.lower() == 'uniform': #I do not recommend this!!!! I see that it is not as random (at least on my computer) as I had hoped
        UniformDistro = [1/ListSize for i in range(ListSize)]
        Values = multinomial(ListSumValue,UniformDistro,size=1)
        OutputValue = Values[0]
    elif Distribution.lower() == 'normal':
        """
        Normal Distribution Construction....It's very flexible and hideous
        Assume a +-3 sigma range.  Warning, this may or may not be a suitable range for your implementation!
        If one wishes to explore a different range, then changes the LowSigma and HighSigma values
        """
        LowSigma    = -3#-3 sigma
        HighSigma   = 3#+3 sigma
        StepSize    = 1/(float(ListSize) - 1)
        ZValues     = [(LowSigma * (1-i*StepSize) +(i*StepSize)*HighSigma) for i in range(int(ListSize))]
        #Construction parameters for N(Mean,Variance) - Default is N(0,1)
        Mean        = 0
        Var         = 1
        #NormalDistro= [self.NormalDistributionFunction(Mean, Var, x) for x in ZValues]
        NormalDistro= list()
        for i in range(len(ZValues)):
            if i==0:
                ERFCVAL = 0.5 * math.erfc(-ZValues[i]/math.sqrt(2))
                NormalDistro.append(ERFCVAL)
            elif i ==  len(ZValues) - 1:
                ERFCVAL = NormalDistro[0]
                NormalDistro.append(ERFCVAL)
            else:
                ERFCVAL1 = 0.5 * math.erfc(-ZValues[i]/math.sqrt(2))
                ERFCVAL2 = 0.5 * math.erfc(-ZValues[i-1]/math.sqrt(2))
                ERFCVAL = ERFCVAL1 - ERFCVAL2
                NormalDistro.append(ERFCVAL)  
            #print "Normal Distribution sum = %f"%sum(NormalDistro)
            Values = multinomial(ListSumValue,NormalDistro,size=1)
            OutputValue = Values[0]
        else:
            raise ValueError ('Cannot create desired vector')
        return OutputValue
    else:
        raise ValueError ('Cannot create desired vector')
    return OutputValue
#Some Examples        
ListSize = 4
ListSumValue = 12
for i in range(100):
    print RandIntVec(ListSize, ListSumValue,Distribution=RandFloats(ListSize))

以上代码可在 github 上找到.这是我为学校开设的课程的一部分. user1149913,也对问题进行了很好的解释.

The code above can be found on github. It is part of a class I built for school. user1149913, also posted a nice explanation of the problem.

推荐答案

我建议不要递归执行此操作:

I would suggest not doing this recursively:

当递归采样时,第一个索引中的值可能具有更大的范围,而后续索引中的值将受到第一个值的约束.这将产生类似于指数分布的内容.

When you sample recursively, the value from the first index has a much greater possible range, whereas values in subsequent indices will be constrained by the first value. This will yield something resembling an exponential distribution.

相反,我建议您从多项式分布中进行采样.这将平等地对待每个索引,限制总和,将所有值强制为整数,并从遵循这些规则的所有可能配置中均匀采样(注意:可以以多种方式发生的配置将根据其发生方式的数量进行加权).

Instead, what I'd recommend is sampling from the multinomial distribution. This will treat each index equally, constrain the sum, force all values to be integers, and sample uniformly from all possible configurations that follow these rules (note: configurations that can happen multiple ways will be weighted by the number of ways that they can occur).

为帮助您用多项式表示法合并问题,总和为n(整数),因此,每个k值(每个索引一个,也为整数)必须在0到n之间.然后按照食谱此处.

To help merge your question with the multinomial notation, total sum is n (an integer), and so each of the k values (one for each index, also integers) must be between 0 and n. Then follow the recipe here.

(或使用 numpy.random.multinomial 如@Dougal的建议).

(Or use numpy.random.multinomial as @Dougal helpfully suggested).

这篇关于如何创建一个总和为x的随机整数向量列表的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆