An algorithm for randomly generating integer partitions of a particular length, in Python?


Question

I've been using the random_element() function provided by SAGE to generate random integer partitions of a given integer (N) that have a particular length (S). I'm trying to generate unbiased random samples from the set of all partitions for given values of N and S. SAGE's function quickly returns random partitions of N (i.e. Partitions(N).random_element()).

However, it slows down immensely when S is added (i.e. Partitions(N,length=S).random_element()). Likewise, generating random partitions of N and keeping only those of length S is incredibly slow.

However, and I hope this helps someone, I've found that when the function returns a partition of N that does not match the length S, the conjugate partition is often of length S. That is:

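from sage.all import Partitions, Partition

S = 10
N = 100
while True:
    part = list(Partitions(N).random_element())
    if len(part) != S:
        # Try the conjugate before rejecting the draw.
        part = list(Partition(part).conjugate())
        if len(part) != S:
            continue
    break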

This increases the rate at which partitions of length S are found and appears to produce unbiased samples (I've examined the results against entire sets of partitions for various values of N and S).

However, I'm using values of N (e.g. 10,000) and S (e.g. 300) that make even this approach impractically slow. The comment associated with SAGE's random_element() function admits there is plenty of room for optimization. So, is there a way to more quickly generate unbiased (i.e. uniformly random) samples of integer partitions matching given values of N and S, perhaps by not generating partitions that do not match S? Additionally, using conjugate partitions works well in many cases to produce unbiased samples, but I can't say that I precisely understand why.
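On the conjugate question: conjugation transposes a partition's Ferrers diagram, so the length of the conjugate equals the largest part of the original. The conjugate step therefore accepts draws whose largest part is S in addition to draws whose length is S, and since conjugation is a bijection, a uniformly random partition of N is as likely to have largest part S as to have length S. A minimal pure-Python sketch of that identity follows; the helper name `conjugate` is illustrative and is not SAGE's implementation.

```python
def conjugate(partition):
    """Return the conjugate of a partition given as a non-increasing list of positive integers."""
    if not partition:
        return []
    # The j-th part of the conjugate counts the parts of the original that are >= j.
    return [sum(1 for p in partition if p >= j) for j in range(1, partition[0] + 1)]


lam = [5, 3, 3, 1]
conj = conjugate(lam)
print(conj)                  # [4, 3, 3, 1, 1]
print(len(conj) == lam[0])   # True: conjugate length equals the original's largest part
```

One caveat that follows from the same argument and may be worth testing: a partition whose length and largest part are both S can only be reached by a direct draw, so it would be expected to turn up about half as often as the other partitions of length S.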

Recommended answer

Finally, I have a definitively unbiased method that has a zero rejection rate. Of course, I've tested it to make sure the results are representative samples of entire feasible sets. It's very fast and totally unbiased. Enjoy.

from sage.all import *
import random

First, a function to find the smallest maximum addend for a partition of n with s parts:

def min_max(n, s):
    # Smallest value the largest part can take, i.e. ceil(n / s).
    _min = n // s
    if n % s > 0:
        _min += 1

    return _min
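The bound above can be checked by brute force on a small case: the largest part can never be smaller than the average n/s, and a partition whose parts are as equal as possible attains the ceiling. The generator name `partitions_with_length` below is hypothetical and only meant for small inputs.

```python
def partitions_with_length(n, s, largest=None):
    """Yield every partition of n into exactly s positive parts, in non-increasing order."""
    if largest is None:
        largest = n
    if s == 0:
        if n == 0:
            yield []
        return
    # The first part k must leave at least 1 for each of the remaining s - 1 parts.
    for k in range(min(largest, n - s + 1), 0, -1):
        for rest in partitions_with_length(n - k, s - 1, k):
            yield [k] + rest


n, s = 10, 4
print(min(p[0] for p in partitions_with_length(n, s)))  # 3
print(-(-n // s))                                       # 3, ceiling division
```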

Next, a function that uses a cache and memoization to find the number of partitions of n with s parts having x as the largest allowed part. This is fast, but I think there's a more elegant solution to be had, e.g., often: P(N,S,max=K) = P(N-K,S-1).
Thanks to ante (https://stackoverflow.com/users/494076/ante) for helping me with this: Finding the number of integer partitions given a total, a number of parts, and a maximum summand

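D = {}  # memo cache: (n, s, x) -> count
def P(n, s, x):
    # Number of partitions of n into exactly s parts, each at most x.
    if n > s*x or x <= 0: return 0
    if n == s*x: return 1
    if (n, s, x) not in D:
        # i counts the parts equal to x; the other s-i parts are at most x-1.
        D[(n, s, x)] = sum(P(n - i*x, s - i, x - 1) for i in range(s))
    return D[(n, s, x)]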
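As a sanity check on the recurrence, the sketch below restates it with `functools.lru_cache` in place of the dictionary and compares it against direct enumeration for small inputs. The names `count_bounded` and `brute_force` are illustrative and do not come from the answer.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_bounded(n, s, x):
    """Number of partitions of n into exactly s positive parts, each at most x."""
    if n > s * x or x <= 0:
        return 0
    if n == s * x:
        return 1
    # i = number of parts equal to x; the remaining s - i parts are at most x - 1.
    return sum(count_bounded(n - i * x, s - i, x - 1) for i in range(s))


def brute_force(n, s, x):
    """Count the same partitions by direct enumeration (only practical for small n)."""
    def rec(remaining, parts_left, cap):
        if parts_left == 0:
            return 1 if remaining == 0 else 0
        return sum(rec(remaining - k, parts_left - 1, k)
                   for k in range(1, min(cap, remaining) + 1))
    return rec(n, s, x)


print(count_bounded(10, 4, 7))  # 9
print(brute_force(10, 4, 7))    # 9
```

Because the count is cumulative in x, the number of partitions whose largest part is exactly k is count_bounded(n, s, k) - count_bounded(n, s, k - 1).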

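Finally, a function to find uniform random partitions of n with s parts, with no rejection rate! Each randomly chosen number encodes a specific partition of n having s parts.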

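def random_partition(n, s):
    S = s
    partition = []
    _min = min_max(n, S)  # smallest possible largest part
    _max = n - S + 1      # largest possible largest part

    # Total number of partitions of n with S parts (no part can exceed _max).
    total = P(n, S, _max)
    which = random.randrange(1, total + 1)  # uniform rank in [1, total]

    while n:
        # Find the smallest k such that at least `which` partitions have largest part <= k.
        for k in range(_min, _max + 1):
            count = P(n, S, k)
            if count >= which:
                count = P(n, S, k - 1)  # partitions with largest part < k
                break

        partition.append(k)
        n -= k
        if n == 0: break
        S -= 1
        which -= count  # rank among partitions whose largest part is exactly k
        _min = min_max(n, S)
        _max = k

    return partition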
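The zero rejection rate comes from treating the random number as a rank: every rank from 1 to the total decodes to a different partition, so a uniform rank gives a uniform partition. The following pure-Python sketch, which runs without SAGE, decodes every rank for a small case to illustrate this. The names `count_bounded`, `unrank` and `random_partition_sketch` are hypothetical, and the code is a restatement under those assumptions rather than the answer's exact implementation.

```python
import random
from functools import lru_cache


@lru_cache(maxsize=None)
def count_bounded(n, s, x):
    """Number of partitions of n into exactly s positive parts, each at most x."""
    if n > s * x or x <= 0:
        return 0
    if n == s * x:
        return 1
    return sum(count_bounded(n - i * x, s - i, x - 1) for i in range(s))


def unrank(n, s, which):
    """Decode rank `which` (1 <= which <= total) into a partition of n with s parts."""
    partition = []
    largest = n - s + 1
    while s > 0:
        smallest = -(-n // s)  # ceil(n / s)
        # Smallest k such that at least `which` partitions have largest part <= k.
        for k in range(smallest, largest + 1):
            if count_bounded(n, s, k) >= which:
                break
        which -= count_bounded(n, s, k - 1)  # skip partitions with a smaller largest part
        partition.append(k)
        n -= k
        s -= 1
        largest = k
    return partition


def random_partition_sketch(n, s, rng=random):
    """Draw a uniform rank and decode it; no draw is ever rejected."""
    total = count_bounded(n, s, n - s + 1)
    return unrank(n, s, rng.randrange(1, total + 1))


n, s = 10, 4
total = count_bounded(n, s, n - s + 1)
decoded = [tuple(unrank(n, s, w)) for w in range(1, total + 1)]
print(total, len(set(decoded)))  # 9 9
```

For large inputs the recursion in the counting function goes roughly as deep as the largest allowed part, so Python's recursion limit may need attention; that is an observation about this sketch, not a measured result.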
