Algorithm to get every possible subset of a list, in order of their product, without building and sorting the entire list (i.e. generators)

Question

Practically, I've got a set of objects with probabilities, and I want to look at each possible group of them, in order of how likely it is that they're all true assuming they're independent -- i.e. in descending order of the product of the elements of the subsets -- or in order of length if the probabilities are the same (so that (1, 0.5) comes after (0.5)).

Example: If I have [ 1, 0.5, 0.1 ] I want [ (), (1), (0.5), (1, 0.5), (0.1), (1, 0.1), (0.5, 0.1), (1, 0.5, 0.1) ]

In essence, this means I want to iterate over the powerset of a set of elements in order, and I could fairly easily generate this, sort it, and be done. However, powersets get pretty big pretty fast, and I expect I'm usually going to want one of the first subsets. I'd rather not generate a list of thousands of subsets, sort them, and then never look past the third. This is where Python generators hopefully save the day!

As a more formal specification of the problem: I need to work out a way to do sorted(powerset(input), key = lambda l : reduce (lambda (p, n), e: (p * e, n-1), l, (1, 0)), reverse=True) (Python 2 syntax) as a generator, or in some other way that lets me avoid building and sorting the entire list.
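
The one-liner above uses tuple-unpacking lambda parameters, which Python 3 no longer accepts. As a reference point, here is a minimal Python 3 sketch of that same build-and-sort baseline. It assumes Python 3.8+ for `math.prod`; `powerset` follows the usual itertools recipe, and `sorted_powerset` is a name invented here for illustration, not something from the question.

```python
from itertools import chain, combinations
from math import prod


def powerset(iterable):
    """All subsets of the input as tuples, smallest first (itertools recipe)."""
    items = list(iterable)
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))


def sorted_powerset(ps):
    """Eager baseline: build every subset, then sort.

    The key is (product, -length), sorted in reverse, so subsets come out in
    descending order of product and, on equal products, shorter ones first.
    """
    return sorted(powerset(ps), key=lambda s: (prod(s), -len(s)), reverse=True)


if __name__ == "__main__":
    print(sorted_powerset([1, 0.5, 0.1]))
```

This builds all 2**n subsets up front, which is exactly the cost the question wants to avoid, but it is handy for checking the output of a lazy version on small inputs.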

I'm reasonably sure this is related to the knapsack problem, along with the subset product problem, but I'm really struggling to get a nice algorithm for it that works, and help would be very much appreciated :-). It's not an issue for it to be slower than building + sorting the whole thing in the worst case (iterating all the way to the end), it just needs much better best case (within the first 10%, say) performance.

Recommended answer

Nice question, it was quite tricky to solve. I can't think of a way to generate the combinations in order either, but I wield the mighty heapq (aka a priority queue) to keep the candidates sorted.

from functools import reduce  # reduce is not a builtin in Python 3
from heapq import heapify, heappush, heappop
import operator

def prob(ps):
    """ returns the probability that *not* all ps are True """
    return 1-reduce(operator.mul, ps)

def gen(ps):
    # turn each to a tuple
    items = ((x,) for x in sorted(ps, reverse=True))

    # create a priority queue, sorted by probability, then by length
    # so that shorter combinations win ties
    pq = [(prob(x), len(x), x) for x in items]
    heapify(pq)

    # because you wanted this
    yield ()

    # as long as there are valid combinations
    while pq:
        # get the best un-yielded combination, the pq makes sure of that
        p, n, x = heappop(pq)
        yield x

        # generate all the combinations from this item
        for other in ps:

            # keeping the tuples sorted -> unique combinations
            # (this assumes the values in ps are distinct)
            if other < x[-1]:

                # create a new combination
                new = x+(other,)
                item = prob(new), len(new), new

                # add it to the queue
                heappush(pq,item)


a = [1, 0.1, 0.5]
print(list(gen(a)))
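
The answer's generator extends a combination only with strictly smaller values, so it skips subsets when the input contains repeated probabilities. Below is a hedged sketch of a variant that tracks positions in the sorted list instead of values. The name `subsets_by_product` and the heap entry layout are illustrative choices, not part of the original answer, and the ordering argument assumes every value lies in [0, 1], so that adding an element never increases the product.

```python
from heapq import heapify, heappop, heappush
from itertools import islice


def subsets_by_product(ps):
    """Lazily yield subsets of ps in descending order of product.

    Ties on product go to the shorter subset. Assumes 0 <= p <= 1 for all p.
    Each yielded tuple lists its elements in descending value order.
    """
    vals = sorted(ps, reverse=True)
    yield ()

    # Heap entries are (-product, length, index tuple); heapq is a min-heap,
    # so negating the product pops the largest product first.
    heap = [(-v, 1, (i,)) for i, v in enumerate(vals)]
    heapify(heap)

    while heap:
        neg_product, length, idx = heappop(heap)
        yield tuple(vals[i] for i in idx)

        # Extend only with positions after the last one used, so every
        # subset is produced exactly once, even with repeated values.
        for j in range(idx[-1] + 1, len(vals)):
            heappush(heap, (neg_product * vals[j], length + 1, idx + (j,)))


if __name__ == "__main__":
    # Only the first few subsets are ever computed.
    print(list(islice(subsets_by_product([0.9] * 20), 3)))
    print(list(subsets_by_product([1, 0.5, 0.1])))
```

Taking the first few results with `islice` touches only a handful of heap entries instead of all 2**n subsets. The heap still grows with each subset consumed, so iterating to the very end costs at least as much as building and sorting the whole list, which the question says is acceptable.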
