如何改进生成多集组合的算法? [英] How can I improve my algorithm for generating combinations of a multiset?

查看:21
本文介绍了如何改进生成多集组合的算法?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

如何在以下生成有界多重集组合的生成器中优化 next()hasNext() 方法?(我将此发布到 C++ 和 Java,因为代码与 C++ 兼容,并且没有不直接转换为 C++ 的特定于 Java 的元素.

How can I optimize the next() and hasNext() methods in the following generator which produces combinations of a bounded multiset? (I posted this to C++ as well as Java because the code is C++-compatible and has no Java-specific elements that do not convert directly to C++.

算法中存在问题的特定区域是整个 hasNext() 方法,这可能会不必要地复杂化,还有一行:

The specific areas of the algorithm which are problematic are the entire hasNext() method which may be unnecessarily complicated, and the line:

if( current[xSlot] > 0 ) aiItemsUsed[current[xSlot]]--;

它有一个 if 语句,我认为可以以某种方式删除.我有一个早期版本的算法,它在 return 语句之前有一些回溯,因此有一个更简单的 hasNext() 测试,但我无法让那个版本工作.

which has an if-statement I think could be removed somehow. I had an earlier version of the algorithm which had some of the backtracking before the return statement and consequently had a much simpler hasNext() test, but I could not get that version to work.

这个算法的背景是很难找到.例如,在 Knuth 7.2.1.3 中,他只是说可以做到(并给出一个练习来证明该算法是可能的),但没有给出算法.同样,我有六本关于组合学的高级文本(包括 Papadimitriou 和 Kreher/Stimson),但它们都没有给出生成多组组合的算法.Kreher 将其作为读者练习".无论如何,如果您可以改进上述算法或提供比我的更有效的工作实现的参考,我将不胜感激.请只给出迭代算法(请不要递归).

The background of this algorithm is that it is very difficult to find. For example, in Knuth 7.2.1.3 he merely says that it can be done (and gives an exercise to prove that the algorithm is possible), but does not give an algorithm. Likewise, I have half a dozen advanced texts on combinatorics (including Papadimitriou and Kreher/Stimson) and none of them give an algorithm for generating combinations of a multiset. Kreher leaves it as an "exercise for the reader". Anyway, if you can improve the algorithm as above or provide a reference to a working implementation more efficient than mine I would appreciate it. Please give only iterative algorithms (no recursion, please).

/** The iterator returns a 1-based array of integers. When the last combination is reached hasNext() will be false.
  * @param aiItems One-based array containing number of items available for each unique item type where aiItems[0] is the number of item types
  * @param ctSlots  The number of slots into which the items go
  * @return The iterator which generates the 1-based array containing the combinations or null in the event of an error.
  */
public static java.util.Iterator<int[]> combination( final int[] aiItems, final int ctSlots ){ // multiset combination into a limited number of slots
    CombinatoricIterator<int[]> iterator = new CombinatoricIterator<int[]>(){
        int xSlot;
        int xItemType;
        int ctItemType;
        int[] current = new int[ctSlots + 1];
        int[] aiItemsUsed = new int[aiItems[0] + 1];
        { reset(); current[0] = ctSlots; ctItemType = aiItems[0]; }
        public boolean hasNext(){
            int xUseSlot = ctSlots;
            int iCurrentType = ctItemType;
            int ctItemsUsed = 0;
            int ctTotalItemsUsed = 0;
            while( true ){
                int xUsedType = current[xUseSlot];
                if( xUsedType != iCurrentType ) return true;
                ctItemsUsed++;
                ctTotalItemsUsed++;
                if( ctTotalItemsUsed == ctSlots ) return false;
                if( ctItemsUsed == aiItems[xUsedType] ){
                    iCurrentType--;
                    ctItemsUsed = 0;
                }
                xUseSlot--;
            }
        }
        public int[] next(){
            while( true ){
                while( xItemType == ctItemType ){
                    xSlot--;
                    xItemType = current[xSlot];
                }
                xItemType++;
                while( true ){
                    while( aiItemsUsed[xItemType] == aiItems[xItemType] && xItemType != current[xSlot] ){
                        while( xItemType == ctItemType ){
                            xSlot--;
                            xItemType = current[xSlot];
                        }
                        xItemType++;
                    }
                    if( current[xSlot] > 0 ) aiItemsUsed[current[xSlot]]--;
                    current[xSlot] = xItemType;
                    aiItemsUsed[xItemType]++;
                    if( xSlot == ctSlots ){
                        return current;
                    }
                    xSlot++;
                }
            }

        }
        public int[] get(){ return current; }
        public void remove(){}
        public void set( int[] current ){ this.current = current; }
        public void setValues( int[] current ){
            if( this.current == null || this.current.length != current.length ) this.current = new int[current.length];
            System.arraycopy( current, 0, this.current, 0, current.length );
        }
        public void reset(){
            xSlot = 1;
            xItemType = 0;
            Arrays.fill( current, 0 ); current[0] = ctSlots;
            Arrays.fill( aiItemsUsed, 0 ); aiItemsUsed[0] = aiItems[0];
        }
    };
    return iterator;
}

附加信息

到目前为止,一些受访者似乎不明白集合和有界多重集之间的区别.有界多重集具有重复元素.例如 { a, a, b, b, b, c } 是一个有界多重集,在我的算法中它会被编码为 { 3, 2, 3, 1 } .请注意,前导3"是集合中项目类型(唯一项目)的数量.如果您提供算法,则以下测试应产生如下所示的输出.

Some of the respondents so far seem to not understand the difference between a set and a bounded multiset. A bounded multiset has repeating elements. For example { a, a, b, b, b, c } is a bounded multiset, which would be encoded as { 3, 2, 3, 1 } in my algorithm. Note that the leading "3" is the number of item types (unique items) in the set. If you supply an algorithm, then the following test should produce the output as shown below.

    private static void combination_multiset_test(){
        int[] aiItems = { 4, 3, 2, 1, 1 };
        int iSlots = 4;
        java.util.Iterator<int[]> iterator = combination( aiItems, iSlots );
        if( iterator == null ){
            System.out.println( "null" );
            System.exit( -1 );
        }
        int xCombination = 0;
        while( iterator.hasNext() ){
            xCombination++;
            int[] combination = iterator.next();
            if( combination == null ){
                System.out.println( "improper termination, no result" );
                System.exit( -1 );
            }
            System.out.println( xCombination + ": " + Arrays.toString( combination ) );
        }
        System.out.println( "complete" );
    }


1: [4, 1, 1, 1, 2]
2: [4, 1, 1, 1, 3]
3: [4, 1, 1, 1, 4]
4: [4, 1, 1, 2, 2]
5: [4, 1, 1, 2, 3]
6: [4, 1, 1, 2, 4]
7: [4, 1, 1, 3, 4]
8: [4, 1, 2, 2, 3]
9: [4, 1, 2, 2, 4]
10: [4, 1, 2, 3, 4]
11: [4, 2, 2, 3, 4]
complete

推荐答案

EDIT:根据澄清的问题调整答案

EDIT: done adjusting answer according to the clarified question

主要思想:同样,结果选择可以类似于自定义数字系统进行编码.可以增加一个计数器并将该计数器解释为一个选择.

Main idea: again, the resulting selection can be encoded similar to a custom numeral system. One could increment a counter and interpret that counter as a selection.

然而,由于选择的大小有一个额外的限制 == target.实现限制的一种天真的方法是只检查结果选择的大小并跳过不满足限制的选择.但这很慢.

However, since there is an additional restriction of the size of selection == target. A naive way to implement the restriction is to just check the size of resulting selection and skip ones that does not satisfy the restriction. But that is slow.

所以我所做的就是做一个更聪明的增量,跳转到直接选择大小合适的.

So all I did was to do a little cleverer increment that jump to the selection with correct size directly.

抱歉,代码是用 Python 编写的.但我以一种类似于 Java 迭代器接口的方式做到了这一点.输入 &输出格式为:

Sorry the code is in Python. But I did it in a way comparable to Java iterator interface. The input & output format is:

haves[i] := multiplicity of the i-th item in the collection
target := output collection must have this size

代码:

class Perm(object):
    def __init__(self,items,haves,target):
        assert sum(haves) >= target
        assert all(h > 0 for h in haves)
        self.items = items
        self.haves = haves
        self.target = target
        self.ans = None
        self.stop = False
    def __iter__(self):
        return self
    def reset(self):
        self.ans = [0]*len(self.haves)
        self.__fill(self.target)
        self.stop = False
    def __fill(self,n):
        """fill ans from LSB with n bits"""
        if n <= 0: return
        i = 0
        while n > self.haves[i]:
            assert self.ans[i] == 0
            self.ans[i] = self.haves[i]
            n -= self.haves[i]
            i += 1
        assert self.ans[i] == 0
        self.ans[i] = n
    def __inc(self):
        """increment from LSB, carry when 'target' or 'haves' constrain is broken"""
        # in fact, the 'target' constrain is always broken on the left most non-zero entry
        # find left most non-zero
        i = 0
        while self.ans[i] == 0:
            i += 1
        # set it to zero
        l = self.ans[i]
        self.ans[i] = 0
        # do increment answer, and carry
        while True:
            # increment to the next entry, if possible
            i += 1
            if i >= len(self.ans):
                self.stop = True
                raise StopIteration
            #
            if self.ans[i] == self.haves[i]:
                l += self.ans[i]
                self.ans[i] = 0
            else:
                l -= 1
                self.ans[i] += 1
                break
        return l
    def next(self):
        if self.stop:
            raise StopIteration
        elif self.ans is None:
            self.reset()
        else:
            l = self.__inc()
            self.__fill(l)
        return self.ans

请注意,items 参数并未真正使用.

Note that the items argument isn't really used.

__init__ 中的 assert 是为了明确我对输入的假设.

The assert in the __init__ is to make clear of my assumption about the input.

__fill 中的 assert 只是在 __fillself.ans 的一个方便的属性代码>被调用.

The assert in the __fill is to just show a convenient property of self.ans in the context that __fill is called.

这是一个很好的代码测试框架:

Here is a nice skeleton for testing the code:

test_cases = [([3,2,1], 3),
              ([3,2,1], 5),
              ([3,2,1], 6),
              ([4,3,2,1,1], 4),
              ([1,3,1,2,4], 4),
             ]

P = Perm(None,*test_cases[-1])
for p in P:
    print p
    #raw_input()

输入 ([1,3,1,2,4], 4) 的示例结果:

[1, 3, 0, 0, 0]
[1, 2, 1, 0, 0]
[0, 3, 1, 0, 0]
[1, 2, 0, 1, 0]
[0, 3, 0, 1, 0]
[1, 1, 1, 1, 0]
[0, 2, 1, 1, 0]
[1, 1, 0, 2, 0]
[0, 2, 0, 2, 0]
[1, 0, 1, 2, 0]
[0, 1, 1, 2, 0]
[1, 2, 0, 0, 1]
[0, 3, 0, 0, 1]
[1, 1, 1, 0, 1]
[0, 2, 1, 0, 1]
[1, 1, 0, 1, 1]
[0, 2, 0, 1, 1]
[1, 0, 1, 1, 1]
[0, 1, 1, 1, 1]
[1, 0, 0, 2, 1]
[0, 1, 0, 2, 1]
[0, 0, 1, 2, 1]
[1, 1, 0, 0, 2]
[0, 2, 0, 0, 2]
[1, 0, 1, 0, 2]
[0, 1, 1, 0, 2]
[1, 0, 0, 1, 2]
[0, 1, 0, 1, 2]
[0, 0, 1, 1, 2]
[0, 0, 0, 2, 2]
[1, 0, 0, 0, 3]
[0, 1, 0, 0, 3]
[0, 0, 1, 0, 3]
[0, 0, 0, 1, 3]
[0, 0, 0, 0, 4]

性能 每个 next() 调用都需要 O(h) 其中 h 是项目(haves 列表的大小).

Performance Each next() call takes O(h) where h is the number of types of items (size of haves list).

这篇关于如何改进生成多集组合的算法?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆