如何从java中的一组大小n迭代生成k个元素子集? [英] How to iteratively generate k elements subsets from a set of size n in java?

查看:139
本文介绍了如何从java中的一组大小n迭代生成k个元素子集?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在研究一个难题,包括分析所有大小的k子集并找出哪一个是最优的。我写了一个解决方案,当子集的数量很少时可以工作,但是对于更大的问题,它会耗尽内存。现在我正在尝试将用python编写的迭代函数转换为java,以便我可以在创建时分析每个子集,并且只获得表示它是如何优化的值,而不是整个集合,这样我就不会用完记忆。以下是我到目前为止的情况,即使是非常小的问题也似乎没有完成:

I'm working on a puzzle that involves analyzing all size k subsets and figuring out which one is optimal. I wrote a solution that works when the number of subsets is small, but it runs out of memory for larger problems. Now I'm trying to translate an iterative function written in python to java so that I can analyze each subset as it's created and get only the value that represents how optimized it is and not the entire set so that I won't run out of memory. Here is what I have so far and it doesn't seem to finish even for very small problems:

public static LinkedList<LinkedList<Integer>> getSets(int k, LinkedList<Integer> set)
{
    int N = set.size();
    int maxsets = nCr(N, k);
    LinkedList<LinkedList<Integer>> toRet = new LinkedList<LinkedList<Integer>>();

    int remains, thresh;
    LinkedList<Integer> newset; 
    for (int i=0; i<maxsets; i++)
    {
        remains = k;
        newset = new LinkedList<Integer>();
        for (int val=1; val<=N; val++)
        {
            if (remains==0)
                break;
            thresh = nCr(N-val, remains-1);
            if (i < thresh)
            {
                newset.add(set.get(val-1));
                remains --;
            }
            else 
            {
                i -= thresh;
            }
        }
        toRet.add(newset);
    }

    return toRet;

}

任何人都可以帮我调试这个函数或建议另一个算法迭代生成大小为k的子集?

Can anybody help me debug this function or suggest another algorithm for iteratively generating size k subsets?

编辑:我终于让这个函数工作了,我不得不创建一个新的变量,与我做的相同,我和thresh比较因为python处理循环索引的方式不同。

I finally got this function working, I had to create a new variable that was the same as i to do the i and thresh comparison because python handles for loop indexes differently.

推荐答案

首先,如果你打算对列表进行随机访问,你应该选择一个列表实现有效支持它。来自LinkedList上的javadoc:

First, if you intend to do random access on a list, you should pick a list implementation that supports that efficiently. From the javadoc on LinkedList:


所有操作都可以预期双重链接的
列表。索引到列表中的操作将从
开始或结束遍历列表,以较接近指定索引为准。

All of the operations perform as could be expected for a doubly-linked list. Operations that index into the list will traverse the list from the beginning or the end, whichever is closer to the specified index.

ArrayList的空间效率更高,随机访问速度更快。实际上,由于你事先知道了长度,你甚至可以使用普通数组。

An ArrayList is both more space efficient and much faster for random access. Actually, since you know the length beforehand, you can even use a plain array.

算法:让我们从简单开始:你将如何生成大小为1的所有子集?可能是这样的:

To algorithms: Let's start simple: How would you generate all subsets of size 1? Probably like this:

for (int i = 0; i < set.length; i++) {
    int[] subset = {i};
    process(subset);
}

其中process是一个对集合执行某些操作的方法,例如检查是否它比目前处理的所有子集更好。

Where process is a method that does something with the set, such as checking whether it is "better" than all subsets processed so far.

现在,您如何扩展它以适用于大小为2的子集?大小为2的子集与大小为1的子集之间的关系是什么?那么,通过移除其最大元素,可以将大小为2的任何子集转换为大小为1的子集。换句话说,可以通过获取大小为1的子集并添加大于集合中所有其他元素的新元素来生成大小为2的每个子集。在代码中:

Now, how would you extend that to work for subsets of size 2? What is the relationship between subsets of size 2 and subsets of size 1? Well, any subset of size 2 can be turned into a subset of size 1 by removing its largest element. Put differently, each subset of size 2 can be generated by taking a subset of size 1 and adding a new element larger than all other elements in the set. In code:

processSubset(int[] set) {
    int subset = new int[2];
    for (int i = 0; i < set.length; i++) {
        subset[0] = set[i];
        processLargerSets(set, subset, i);
    }
}

void processLargerSets(int[] set, int[] subset, int i) {
    for (int j = i + 1; j < set.length; j++) {
        subset[1] = set[j];
        process(subset);
    }
}

对于任意大小为k的子集,请观察任何子集通过切割最大元素,可以将大小为k的大小变为大小为k-1的子集。也就是说,可以通过生成大小为k-1的所有子集来生成大小为k的所有子集,并且对于这些子集中的每一个,并且每个值大于子集中的最大值,将该值添加到该集合。在代码中:

For subsets of arbitrary size k, observe that any subset of size k can be turned into a subset of size k-1 by chopping of the largest element. That is, all subsets of size k can be generated by generating all subsets of size k - 1, and for each of these, and each value larger than the largest in the subset, add that value to the set. In code:

static void processSubsets(int[] set, int k) {
    int[] subset = new int[k];
    processLargerSubsets(set, subset, 0, 0);
}

static void processLargerSubsets(int[] set, int[] subset, int subsetSize, int nextIndex) {
    if (subsetSize == subset.length) {
        process(subset);
    } else {
        for (int j = nextIndex; j < set.length; j++) {
            subset[subsetSize] = set[j];
            processLargerSubsets(set, subset, subsetSize + 1, j + 1);
        }
    }
}

测试代码:

static void process(int[] subset) {
    System.out.println(Arrays.toString(subset));
}


public static void main(String[] args) throws Exception {
    int[] set = {1,2,3,4,5};
    processSubsets(set, 3);
}

但是在大​​集合上调用它之前,请记住子集的数量可以增长很快。

But before you invoke this on huge sets remember that the number of subsets can grow rather quickly.

这篇关于如何从java中的一组大小n迭代生成k个元素子集?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆