固定子集大小的完美和问题 [英] Perfect sum problem with fixed subset size

查看:60
本文介绍了固定子集大小的完美和问题的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在寻找一种最少时间复杂的算法,可以解决完美求和问题的变体(最初是:从大小为 n 的整数数组[*]中查找所有可变大小的子集组合总和为特定数字 x ),其中子集组合大小为固定大小 k ,并返回可能的组合,而无直接和间接(当存在包含完全相同的元素以另一个顺序重复).

我知道此问题是NP难题,因此我并不期望有一个完美的通用解决方案,但对于我来说,至少可以在合理的时间内运行, n 接近1000和 k 大约10

到目前为止我尝试过的事情:

  • 找到一个组合,然后对其进行连续修改及其修改

    假设我有一个数组,例如:

  s = [1,2,3,3,4,5,6,9] 

所以我有 n = 8 ,我想 x = 10 表示 k = 3

由于某种晦涩的方法(bruteforce?),我发现了一个子集 [3,3,4]

从这个子集中,我找到了其他可能的组合,方法是从中取出两个元素,然后用总和相同的其他元素替换它们,即(3,3)可以替换为(1,5),因为两者的总和相同,并且替换数字尚未使用.因此,我获得了另一个子集 [1,5,4] ,然后对所有获得的子集重复该过程……是无限期地吗?

这里建议的主要问题是,很难确定何时完成,而且这种方法比较混乱.我想像了这种方法的一些变体,但它们确实正在开发中

  • 遍历集合以列出所有总计为 x
  • k 个长组合

非常自我解释.这是一种幼稚的方法,在我的情况下效果不佳,因为我有一个很大的 n 和一个 k ,它的大小不足以避免灾难性的大量组合(组合数量的大小为10 ^ 27!)

我尝试了几种与设置研究领域相关的机制,而不是愚蠢地遍历所有可能性,但是它相当复杂并且仍在进行中

您有什么建议?(代码段可以使用任何语言,但我更喜欢C ++)

[*]为了清除有关基本集合是否可以包含重复项的疑问,我使用了术语数组".而不是设置"更准确地说.在我的情况下,集合可以包含重复的整数,并且可以包含很多重复的整数,例如,对于1000个元素(四舍五入)有70个不同的整数

解决方案

这是Python中的完整解决方案.到读者可以翻译成C ++.

与通常的子集总和一样,解的双向链接摘要的生成是伪多项式.它是 O(count_values * distinct_sums * depths_of_sums).但是,实际上遍历它们可能是指数级的.但是使用生成器的方式可以避免使用大量内存来生成该列表,即使运行起来可能需要很长时间.

从集合中

 导入namedtuple#这是一个双向链接列表.#(值,尾部)将是一组解决方案.(next_answer)是另一个.SumPath = namedtuple('SumPath','value tail next_answer')def fixed_sum_paths(数组,目标,计数):#首先查找值计数以处理重复项.value_repeats = {}对于数组中的值:如果value_repeats中的值:value_repeats [值] + = 1别的:value_repeats [值] = 1#path [depth] [x]将是所有尺寸总和的子集,总和为x.路径= [{}对于范围(count + 1)中的i]#首先,我们添加空集.路径[0] [0] = SumPath(值=无,尾巴=无,next_answer =无)#现在,我们开始向其中添加值.对于值,在value_repeats.items()中重复:#深度反转可避免看到使用此值可以找到的路径.对于反向深度(范围(len(路径))):为了获得结果,在paths [depth] .items()中的路径:对于范围(1,重复+1)中的i:如果计数<我+深度:#不要填得太深.休息结果+ =值如果导致路径[depth + i]:路径= SumPath(值=值,tail = path,next_answer = paths [depth + i] [结果])别的:路径= SumPath(值=值,tail = path,next_answer =无)路径[depth + i] [结果] =路径#细微的错误修复,价值,价值的路径#不应导致值,other_value,因为#我们已经首先插入了那个.路径= SumPath(值=值,tail = path.tail,next_answer =无)返回路径[计数] [目标]def path_iter(路径):如果path.value为None:#我们是尾巴屈服 []别的:而路径不是None时:值=路径.值对于path_iter(paths.tail)中的答案:answer.append(值)屈服答案路径= paths.next_answerdef fixed_sums(数组,目标,计数):路径= fixed_sum_paths(数组,目标,计数)返回path_iter(paths)对于fixed_sums([1,2,3,3,4,5,6,9],10,3)中的路径:打印(路径) 

顺便提一下您的示例,以下是解决方案:

  [1、3、6][1,4,5][2,3,5][3,3,4] 

I am looking for a least time-complex algorithm that would solve a variant of the perfect sum problem (initially: finding all variable size subset combinations from an array [*] of integers of size n that sum to a specific number x) where the subset combination size is of a fixed size k and return the possible combinations without direct and also indirect (when there's a combination containing the exact same elements from another in another order) duplicates.

I'm aware this problem is NP-hard, so I am not expecting a perfect general solution but something that could at least run in a reasonable time in my case, with n close to 1000 and k around 10

Things I have tried so far:

  • Finding a combination, then doing successive modifications on it and its modifications

    Let's assume I have an array such as:

s = [1,2,3,3,4,5,6,9]

So I have n = 8, and I'd like x = 10 for k = 3

I found thanks to some obscure method (bruteforce?) a subset [3,3,4]

From this subset I'm finding other possible combinations by taking two elements out of it and replacing them with other elements that sum the same, i.e. (3, 3) can be replaced by (1, 5) since both got the same sum and the replacing numbers are not already in use. So I obtain another subset [1,5,4], then I repeat the process for all the obtained subsets... indefinitely?

The main issue as suggested here is that it's hard to determine when it's done and this method is rather chaotic. I imagined some variants of this method but they really are work in progress

  • Iterating through the set to list all k long combinations that sum to x

Pretty self explanatory. This is a naive method that do not work well in my case since I have a pretty large n and a k that is not small enough to avoid a catastrophically big number of combinations (the magnitude of the number of combinations is 10^27!)

I experimented several mechanism related to setting an area of research instead of stupidly iterating through all possibilities, but it's rather complicated and still work in progress

What would you suggest? (Snippets can be in any language, but I prefer C++)

[*] To clear the doubt about whether or not the base collection can contain duplicates, I used the term "array" instead of "set" to be more precise. The collection can contain duplicate integers in my case and quite much, with 70 different integers for 1000 elements (counts rounded), for example

解决方案

Here is a complete solution in Python. Translation to C++ is left to the reader.

Like the usual subset sum, generation of the doubly linked summary of the solutions is pseudo-polynomial. It is O(count_values * distinct_sums * depths_of_sums). However actually iterating through them can be exponential. But using generators the way I did avoids using a lot of memory to generate that list, even if it can take a long time to run.

from collections import namedtuple
# This is a doubly linked list.
# (value, tail) will be one group of solutions.  (next_answer) is another.
SumPath = namedtuple('SumPath', 'value tail next_answer')

def fixed_sum_paths (array, target, count):
    # First find counts of values to handle duplications.
    value_repeats = {}
    for value in array:
        if value in value_repeats:
            value_repeats[value] += 1
        else:
            value_repeats[value] = 1

    # paths[depth][x] will be all subsets of size depth that sum to x.
    paths = [{} for i in range(count+1)]

    # First we add the empty set.
    paths[0][0] = SumPath(value=None, tail=None, next_answer=None)

    # Now we start adding values to it.
    for value, repeats in value_repeats.items():
        # Reversed depth avoids seeing paths we will find using this value.
        for depth in reversed(range(len(paths))):
            for result, path in paths[depth].items():
                for i in range(1, repeats+1):
                    if count < i + depth:
                        # Do not fill in too deep.
                        break
                    result += value
                    if result in paths[depth+i]:
                        path = SumPath(
                            value=value,
                            tail=path,
                            next_answer=paths[depth+i][result]
                            )
                    else:
                        path = SumPath(
                            value=value,
                            tail=path,
                            next_answer=None
                            )
                    paths[depth+i][result] = path

                    # Subtle bug fix, a path for value, value
                    # should not lead to value, other_value because
                    # we already inserted that first.
                    path = SumPath(
                        value=value,
                        tail=path.tail,
                        next_answer=None
                        )
    return paths[count][target]

def path_iter(paths):
    if paths.value is None:
        # We are the tail
        yield []
    else:
        while paths is not None:
            value = paths.value
            for answer in path_iter(paths.tail):
                answer.append(value)
                yield answer
            paths = paths.next_answer

def fixed_sums (array, target, count):
    paths = fixed_sum_paths(array, target, count)
    return path_iter(paths)

for path in fixed_sums([1,2,3,3,4,5,6,9], 10, 3):
    print(path)

Incidentally for your example, here are the solutions:

[1, 3, 6]
[1, 4, 5]
[2, 3, 5]
[3, 3, 4]

这篇关于固定子集大小的完美和问题的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆