如何编写最大集打包算法? [英] How to code the maximum set packing algorithm?

查看:12
本文介绍了如何编写最大集打包算法?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

假设我们有一个有限集合 S 和一个 S 的子集列表.然后,集合打包问题询问列表中的一些 k 子集是否成对不相交.该问题的优化版本,最大集打包,要求列表中成对不相交集的最大数量.

Suppose we have a finite set S and a list of subsets of S. Then, the set packing problem asks if some k subsets in the list are pairwise disjoint . The optimization version of the problem, maximum set packing, asks for the maximum number of pairwise disjoint sets in the list.

http://en.wikipedia.org/wiki/Set_packing

所以,让 S = {1,2,3,4,5,6,7,8,9,10}

and `Sa = {1,2,3,4}`
and `Sb = {4,5,6}`
and `Sc = {5,6,7,8}`
and `Sd = {9,10}`

那么成对不相交集的最大数量为3(Sa, Sc, Sd)

Then the maximum number of pairwise disjoint sets are 3 ( Sa, Sc, Sd )

我找不到任何有关所涉及算法的文章.你能解释一下吗?

I could not find any articles about the algorithm involved. Can you shed some light on the same?

我的做法:

根据大小对集合进行排序.从最小尺寸的集合开始.如果下一个集合中没有元素与当前集合相交,那么我们将集合合并并增加最大集合的计数.这听起来不错吗?有更好的想法吗?

Sort the sets according to the size. Start from the set of the smallest size. If no element of the next set intersects with the current set, then we unite the set and increase the count of maximum sets. Does this sound good to you? Any better ideas?

推荐答案

正如 hivert 指出的,这个问题是 NP-hard,所以没有有效的方法来解决这个问题.但是,如果您的输入相对较小,您仍然可以将其关闭.毕竟,指数并不意味着不可能.只是随着输入大小的增长,指数问题很快变得不切实际.但是对于像 25 组这样的东西,你可以很容易地暴力破解它.

As hivert pointed out, this problem is NP-hard, so there's no efficient way to do this. However, if your input is relatively small, you can still pull it off. Exponential doesn't mean impossible, after all. It's just that exponential problems become impractical very quickly, as the input size grows. But for something like 25 sets, you can easily brute force it.

这是一种方法.假设您有 n 个子集,称为 S0S1、...等.我们可以尝试每个子集组合,然后选择一个具有最大基数的.只有 2^25 = 33554432 个选择,所以这可能足够合理.

Here's one approach. Let's say you have n subsets, called S0, S1, ..., etc. We can try every combination of subsets, and pick the one with maximum cardinality. There are only 2^25 = 33554432 choices, so this is probably reasonable enough.

做到这一点的一个简单方法是注意任何严格低于 2^N 的非负数都表示子集的特定选择.查看数字的二进制表示,并选择其索引对应于打开的位的集合.因此,如果数字为 11,则第 0、第 1 和第 3 位为 on,这对应于组合 [S0, S1, S3].然后您只需验证这三个集合实际上是不相交的.

An easy way to do this is to notice that any non-negative number strictly below 2^N represents a particular choice of subsets. Look at the binary representation of the number, and choose the sets whose indices correspond to the bits that are on. So if the number is 11, the 0th, 1st and 3rd bits are on, and this corresponds to the combination [S0, S1, S3]. Then you just verify that these three sets are in fact disjoint.

您的程序如下:

  1. 从 0 迭代 i 到 2^N - 1
  2. 对于 i 的每个值,使用打开的位来找出相应的子集组合.
  3. 如果这些子集成对不相交,请使用此组合更新您的最佳答案(即,如果它大于您当前的最佳答案,则使用此组合).

或者,使用回溯来生成您的子集.这两种方法是等效的,模实现权衡.回溯会产生一些堆栈开销,但如果您在进行过程中检查不相交性,则可能会切断整条计算线.例如,如果 S1 和 S2 不是不相交的,那么它将永远不会打扰包含这两个的任何更大的组合,从而节省一些时间.迭代法无法通过这种方式进行自身优化,但由于位运算和紧密循环,快速高效.

Alternatively, use backtracking to generate your subsets. The two approaches are equivalent, modulo implementation tradeoffs. Backtracking will have some stack overhead, but can cut off entire lines of computation if you check disjointness as you go. For example, if S1 and S2 are not disjoint, then it will never bother with any bigger combinations containing those two, saving some time. The iterative method can't optimize itself in this way, but is fast and efficient because of the bitwise operations and tight loop.

这里唯一重要的是如何检查子集是否成对不相交.您也可以在这里使用各种技巧,具体取决于约束条件.

The only nontrivial matter here is how to check if the subsets are pairwise disjoint. There are all sorts of tricks you can pull here as well, depending on the constraints.

一种简单的方法是从一个空集合结构开始(从您选择的语言中选择任何您想要的),然后从每个子集中逐个添加元素.如果你碰到了一个已经在集合中的元素,那么它至少出现在两个子集中,你可以放弃这个组合.

A simple approach is to start with an empty set structure (pick whatever you want from the language of your choice) and add elements from each subset one by one. If you ever hit an element that's already in the set, then it occurs in at least two subsets, and you can give up on this combination.

如果原集合Sm个元素,而m比较小,可以将它们每一个映射到范围[0, m-1] 并为每个集合使用位掩码.所以如果m <= 64,你可以使用Java long 来表示每个子集.打开与子集中元素对应的所有位.由于按位运算的速度,这允许极快的设置操作.按位与对应集合交集,按位或是并集.您可以通过查看交集是否为空来检查两个子集是否不相交(即,对两个位掩码进行 AND 运算会得到 0).

If the original set S has m elements, and m is relatively small, you can map each of them to the range [0, m-1] and use bitmasks for each set. So if m <= 64, you can use a Java long to represent each subset. Turn on all the bits that correspond to the elements in the subset. This allows blazing fast set operation, because of the speed of bitwise operations. Bitwise AND corresponds to set intersection, and bitwise OR is a union. You can check if two subsets are disjoint by seeing if the intersection is empty (i.e., ANDing the two bitmasks gives you 0).

如果你没有这么少的元素,你仍然可以避免多次重复设置的交集.你的集合很少,所以在开始时预先计算哪些是不相交的.您可以只存储一个布尔矩阵 D,使得 D[i][j] = true 如果 i 和 j 不相交.然后,您只需查找组合中的所有对来验证成对的不相交性,而不是进行真正的集合操作.

If you don't have so few elements, you can still avoid repeating the set intersections multiple times. You have very few sets, so precompute which ones are disjoint at the start. You can just store a boolean matrix D, such that D[i][j] = true iff i and j are disjoint. Then you just look up all pairs in a combination to verify pairwise disjointness, rather than doing real set operations.

这篇关于如何编写最大集打包算法?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆