Sum-subset with a fixed subset size

Problem description

The sum-subset problem states:

Given a set of integers, is there a non-empty subset whose sum is zero?

This problem is NP-complete in general. I'm curious if the complexity of this slight variant is known:

Given a set of integers, is there a subset of size k whose sum is zero?

For example, if k = 1 and the integers are already sorted, you can do a binary search for zero in O(log n). If k = 2, you can get it down to O(n log n) (e.g. see "Find a pair of elements from an array whose sum equals a given number"). If k = 3, you can do it in O(n^2) (e.g. see "Finding three elements in an array whose sum is closest to a given number").
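To make the k = 2 and k = 3 bounds concrete, here is a minimal Python sketch, assuming the usual sort-then-two-pointer approach that the linked questions discuss. The function names `has_pair`, `two_sum_zero` and `three_sum_zero` are made up for illustration.

```python
def has_pair(a, target, lo, hi):
    """Two-pointer scan over the sorted slice a[lo..hi] for two entries summing to target."""
    while lo < hi:
        s = a[lo] + a[hi]
        if s == target:
            return True
        if s < target:
            lo += 1   # sum too small: move the left pointer to a larger value
        else:
            hi -= 1   # sum too large: move the right pointer to a smaller value
    return False


def two_sum_zero(nums):
    """k = 2: O(n log n) for the sort, O(n) for the scan."""
    a = sorted(nums)
    return has_pair(a, 0, 0, len(a) - 1)


def three_sum_zero(nums):
    """k = 3: fix one element, then scan the elements after it; O(n^2) overall."""
    a = sorted(nums)
    return any(has_pair(a, -a[i], i + 1, len(a) - 1) for i in range(len(a) - 2))
```

For instance, `three_sum_zero([-5, 1, 4, 9])` finds -5 + 1 + 4, while `two_sum_zero([1, 2, 3])` finds nothing.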

Is there a known bound that can be placed on this problem as a function of k?

As motivation, I was thinking about the question "How do you partition an array into 2 parts such that the two parts have equal average?" and trying to determine if it is actually NP-complete. The answer lies in whether or not there is a formula as described above.

Barring a general solution, I'd be very interested in knowing an optimal bound for k=4.

Solution

For k=4, space complexity O(n), time complexity O(n^2 * log(n))

Sort the array. Starting from the 2 smallest and the 2 largest elements, compute all lesser sums of 2 elements (a[i] + a[j]) in non-decreasing order and all greater sums of 2 elements (a[k] + a[l]) in non-increasing order (here k and l are array indexes, not the subset size). Increase the lesser sum if the total sum is less than zero, decrease the greater one if the total sum is greater than zero, and stop when the total sum is zero (success) or a[i] + a[j] > a[k] + a[l] (failure).

The trick is to iterate through all the indexes i and j in such a way that (a[i] + a[j]) never decreases, and through k and l in such a way that (a[k] + a[l]) never increases. A priority queue helps to do this:

  1. Put key=(a[i] + a[j]), value=(i = 0, j = 1) into the priority queue.
  2. Pop (sum, i, j) from the priority queue.
  3. Use sum in the above algorithm.
  4. Put (a[i+1] + a[j]), i+1, j and (a[i] + a[j+1]), i, j+1 into the priority queue only if these elements were not already used. To keep track of used elements, maintain an array of the maximal used 'j' for each 'i'. It is enough to use only values of 'j' that are greater than 'i'.
  5. Continue from step 2.
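The idea above can be sketched in Python as follows. This is an illustration under stated assumptions, not the exact bookkeeping of step 4: instead of pushing two neighbours and tracking used 'j' values, it keeps one heap entry per index (a k-way merge), which also needs only O(n) heap space. When the total is zero it collects every pair with that sum before testing for four distinct indexes, since a lesser and a greater pair may share an element. With distinct integers each such group holds at most n/2 pairs; with duplicates the groups can grow larger. The names `pair_sums` and `four_sum_zero` are hypothetical.

```python
import heapq


def pair_sums(a, descending=False):
    """Yield (sum, i, j) with i < j over the sorted list a, ordered by sum.

    Assumption: one heap entry per index (k-way merge), so the heap holds
    at most len(a) - 1 items at any time.
    """
    n = len(a)
    if descending:
        heap = [(-(a[j - 1] + a[j]), j - 1, j) for j in range(1, n)]
    else:
        heap = [(a[i] + a[i + 1], i, i + 1) for i in range(n - 1)]
    heapq.heapify(heap)
    while heap:
        key, i, j = heapq.heappop(heap)
        yield (-key if descending else key), i, j
        if descending and i > 0:
            # next smaller sum that keeps the same right index j
            heapq.heappush(heap, (-(a[i - 1] + a[j]), i - 1, j))
        elif not descending and j + 1 < n:
            # next larger sum that keeps the same left index i
            heapq.heappush(heap, (a[i] + a[j + 1], i, j + 1))


def four_sum_zero(nums):
    """Return four values at distinct positions summing to zero, or None."""
    a = sorted(nums)
    lo, hi = pair_sums(a), pair_sums(a, descending=True)
    low, high = next(lo, None), next(hi, None)
    while low is not None and high is not None and low[0] <= high[0]:
        total = low[0] + high[0]
        if total < 0:
            low = next(lo, None)      # increase the lesser sum
        elif total > 0:
            high = next(hi, None)     # decrease the greater sum
        else:
            s = low[0]
            lows, highs = [], []
            while low is not None and low[0] == s:
                lows.append(low[1:])
                low = next(lo, None)
            while high is not None and high[0] == -s:
                highs.append(high[1:])
                high = next(hi, None)
            for p in lows:
                for q in highs:
                    if len({*p, *q}) == 4:   # four distinct indexes
                        return tuple(a[x] for x in (*p, *q))
    return None
```

For example, `four_sum_zero([1, 2, -3, 4, -2, 7])` finds the values -3, -2, 1 and 4, while `four_sum_zero([-4, 1, 2, 50])` returns `None`, because the only two pairs with opposite sums, (-4, 1) and (1, 2), share the element 1.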

For k>4

If space complexity is limited to O(n), I cannot find anything better than using brute force for k-4 values and the above algorithm for the remaining 4 values. Time complexity O(n^(k-2) * log(n)).

For very large k, integer linear programming may give some improvement.

Update

If n is very large (on the same order as the maximum integer value), it is possible to implement an O(1) priority queue, improving the complexities to O(n^2) and O(n^(k-2)).

If n >= k * INT_MAX, a different algorithm with O(n) space complexity is possible. Precalculate a bitset for all possible sums of k/2 values, and use it to check sums of the other k/2 values. Time complexity is O(n^(ceil(k/2))).
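A rough Python sketch of the bitset idea follows, assuming the values are small in magnitude so that the bitset (here a Python integer) stays compact. The bitset records only which sums are reachable, not which elements produced them, so this sketch adds a confirmation step on every hit to make sure the two halves do not reuse an element. That step is my own addition and can cost extra time beyond the bound quoted above. The name `k_sum_zero_bitset` is hypothetical.

```python
from itertools import combinations


def k_sum_zero_bitset(nums, k):
    """Return True if some k elements at distinct positions sum to zero."""
    n, h = len(nums), k // 2
    if not 1 <= k <= n:
        return False
    # Shift every sum of h values into the non-negative range.
    offset = h * max(abs(x) for x in nums)
    bits = 0
    for combo in combinations(nums, h):
        bits |= 1 << (sum(combo) + offset)   # mark this h-sum as reachable
    for idx in combinations(range(n), k - h):
        need = -sum(nums[i] for i in idx)
        if abs(need) > offset or not (bits >> (need + offset)) & 1:
            continue                          # no h-sum can complete this half
        # Assumption: confirm the hit using only the elements not already chosen.
        rest = [nums[i] for i in range(n) if i not in idx]
        if any(sum(c) == need for c in combinations(rest, h)):
            return True
    return False
```

For example, `k_sum_zero_bitset([1, 2, -3, 4, -2, 7], 4)` returns `True`, and `k_sum_zero_bitset([-4, 1, 2, 50], 4)` returns `False` even though the bitset reports a hit, because that hit would need the element 1 twice.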
