有效计数笛卡尔积中总和超过特定数的集合 [英] Efficiently count sets in a cartesian product that sum above a specific number

查看:78
本文介绍了有效计数笛卡尔积中总和超过特定数的集合的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有以下适用的Python 3代码:

I have the below Python 3 code that works:

import itertools

loops = 10
results = [4, 2.75, 2.75, 1.5, 1.5, 1.5, 0]
threshold = loops * 2
cartesian_product = itertools.product(results, repeat=loops)

good, bad = 0, 0

for e in cartesian_product:
    if (sum(e) >= threshold):
        good += 1
    else:
        bad += 1

print('Ratio of good vs total is {0:.3f}%'.format(100 * good / (good + bad)))

如果将循环数增加到更大数量(> 15),则该程序将花费很长时间才能完成.

If I increase the loops to a larger number (>15), the program takes too long to finish.

是否有更有效的方法来计算比率?

Is there a more efficient way/algorithm to calculate the ratio?

推荐答案

这是一个解决方案.想法是计算您可以通过 n 循环获得的所有可能值总和,计算不同的可能总和,并将所有大于阈值的总和累加在一起.

Here is a solution. The idea is to calculate all possible sums of our values you can obtain with n loops, counting the different possible sums, and counting together all sums that are larger than the threshold.

然后,我们可以通过将我们的值加到先前的总和中来生成 n + 1 个循环的所有可能的总和.我们可以希望,不同的总和的数量不会增加太多,因为我们将相同的值相乘了很多次,并且重新组合了所有大于阈值的总和.

Then, we can generate all possible sums for n+1 loops by adding our values to the previous sums. We can hope that the number of different possible sums won't grow too large, as we add many times the same values, and we regroup all sums larger than the threshold.

from collections import Counter

def all_sums(values, threshold, previous_sums = None):
    """
    values must be sorted
    previous_sums is a Counter of previously obtained possible sums

    Returns a Counter of all possible sums of values and the previous sums
    """
    if not previous_sums:
        previous_sums = Counter({0:1})

    new = Counter()
    for existing_sum, ex_sum_count in sorted(previous_sums.items()):
        for index, val in enumerate(values):
            total = existing_sum + val
            if total < threshold:
                # With the current value, we have found ex_sum_count
                # ways to obtain that total
                new.update({total: ex_sum_count})
            else:
                # We don't need the exact sum, as anything we could
                # later add to it will be over the threshold.
                # We count them under the value = threshold
                # As 'values' is sorted, all subsequent values will also give 
                # a sum over the threshold
                values_left = len(values) - index
                new.update({threshold: values_left * ex_sum_count})
                break
    return new


def count_sums(values, threshold, repeat):
    """
    values must be sorted!

    Recursively calculates the possible sums of 'repeat' values,
    counting together all values over 'threshold'
    """
    if repeat == 1:
        return all_sums(values, threshold, previous_sums=None)
    else:
        return all_sums(values, threshold, previous_sums=count_sums(values, threshold, repeat=repeat-1))

让我们以您的示例为例:

Let's try it on your example:

loops = 10
results = [4, 2.75, 2.75, 1.5, 1.5, 1.5, 0]
threshold = loops * 2

values = sorted(results)

sums = count_sums(values, threshold, repeat=loops)
print(sums)
# Counter({20: 137401794, 19.75: 16737840, 18.25: 14016240, 18.5: 13034520, 19.5: 12904920,
# 17.0: 12349260, 15.75: 8573040, 17.25: 8048160, 15.5: 6509160, 16.75: 6395760, 14.25: 5171040,
# 18.0: 5037480, 14.5: 4461480, 16: 3739980, 18.75: 3283020, 19.25: 3220800, 13.0: 3061800, 
# 14.0: 2069550, 12.75: 1927800, 15.25: 1708560, 13.25: 1574640, 17.5: 1391670, 11.5: 1326780,
# 11.75: 1224720, 14.75: 1182660, 16.5: 1109640, 10.25: 612360, 17.75: 569520, 11.25: 453600, 
# 16.25: 444060, 12.5: 400680, 10.0: 374220, 12: 295365, 13.75: 265104, 10.5: 262440, 19.0: 229950,
# 13.5: 204390, 8.75: 204120, 15.0: 192609, 9.0: 153090, 8.5: 68040, 9.75: 65520, 7.5: 61236, 
# 7.25: 45360, 11.0: 44940, 12.25: 21840, 6.0: 17010, 7.0: 7560, 5.75: 6480, 8.25: 5280, 4.5: 3240,
# 9.5: 2520, 10.75: 720, 4.25: 540, 5.5: 450, 3.0: 405, 6.75: 180, 8: 45, 1.5: 30, 2.75: 20, 4: 10, 0: 1})
number_of_sums = len(results) ** loops
# 282475249
good = sums[threshold]
# 137401794
bad = number_of_sums - good
# 145073455

我为它计时,在我那台比较旧的机器上大约需要9毫秒.

I timed it, it takes about 9 ms on my rather old machine.

以及其他一些数据:10个不同的值,20个循环:

And with some other data: 10 different values, 20 loops:

loops = 20
results = [4, 2.75, 2.45, 1.5, 1.3, 0.73, 0.12, 1.4, 1.5, 0]
threshold = loops * 2
values = sorted(results)

sums = count_sums(values, threshold, repeat=loops)
number_of_sums = len(results) ** loops
good = sums[threshold]
bad = number_of_sums - good
print(good)
print(bad)
# 5440943363190360728
# 94559056636809639272

我在不到12秒的时间内得到了

which I obtain in less than 12 seconds.

这篇关于有效计数笛卡尔积中总和超过特定数的集合的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆