生成多重集的幂集 [英] Generating the powerset of a multiset
问题描述
假设我有一个多重集
{a,a,a,b,c}
我可以从中做出以下选择:
from which I can make the following selections:
{}
{a}
{a,a}
{a,a,a}
{a,a,a,b}
{a,a,a,b,c}
{a,a,a,c}
{a,a,b}
{a,a,b,c}
{a,a,c}
{a,b}
{a,b,c}
{a,c}
{b}
{b,c}
{c}
请注意,选择的数量等于 16.多重集的幂集的基数 card(P(M))
,在 OEIS as
Notice that the number of selections equals 16. The cardinality of a powerset of a multiset, card(P(M))
, is defined on OEIS as
card(P(M)) = prod(mult(x) + 1) for all x in M
其中 mult(x)
是 m
中 x
的重数,prod
是条款.因此,对于我们的示例,这将等于 4 x 2 x 2 = 16.
where mult(x)
is the multiplicity of x
in M
and prod
is the product of the terms. So for our example, this would amount to 4 x 2 x 2 = 16.
例如,假设这些元素的多样性非常高:
Let's say, for example, that the multiplicity of these elements is very high:
m(a) = 21
m(b) = 36
m(c) = 44
然后
card(P(M)) = 22 * 37 * 45 = 36630.
但如果我们将所有这些元素视为不同的 - 作为一个集合 - 幂集的基数将是
But if we were to treat all those elements as distinct - as a set - the cardinality of the powerset would be
card(P(S)) = 2^(21+36+44) = 2535301200456458802993406410752.
标准"这个问题的解决方案建议只计算所有元素都被视为不同的集合的幂集,然后修剪结果以删除重复项.这是一个 O(2^n)
复杂度的解决方案.
The "standard" solution for this problem suggests to just compute the powerset of the set where all of the elements are treated as distinct, and then prune the results to remove the duplicates. That's a solution with O(2^n)
complexity.
是否存在生成复杂度为card(P(M))
阶的多重集幂集的通用算法?
Does a general algorithm for generating a powerset of a multiset with complexity on the order of card(P(M))
exist?
推荐答案
powerset
recipe with itertools
您所要求的通常称为 powerset
,可作为 itertools
配方以及 more_itertools
模块中的函数使用.请参阅文档:
powerset
recipe with itertools
What you are asking is usually called the powerset
and is available as an itertools
recipe, as well as a function in the module more_itertools
. See the documentation:
multiset = ['a', 'a', 'a', 'b', 'c']
#
# USING ITERTOOLS
#
import itertools
def powerset(iterable):
"powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
s = list(iterable)
return itertools.chain.from_iterable(itertools.combinations(s, r) for r in range(len(s)+1))
print(list(powerset(multiset)))
# [(), ('a',), ('a',), ('a',), ('b',), ('c',), ('a', 'a'), ('a', 'a'), ('a', 'b'), ('a', 'c'), ('a', 'a'), ('a', 'b'), ('a', 'c'), ('a', 'b'), ('a', 'c'), ('b', 'c'), ('a', 'a', 'a'), ('a', 'a', 'b'), ('a', 'a', 'c'), ('a', 'a', 'b'), ('a', 'a', 'c'), ('a', 'b', 'c'), ('a', 'a', 'b'), ('a', 'a', 'c'), ('a', 'b', 'c'), ('a', 'b', 'c'), ('a', 'a', 'a', 'b'), ('a', 'a', 'a', 'c'), ('a', 'a', 'b', 'c'), ('a', 'a', 'b', 'c'), ('a', 'a', 'b', 'c'), ('a', 'a', 'a', 'b', 'c')]
#
# USING MORE_ITERTOOLS
#
import more_itertools
print(list(more_itertools.powerset(multiset)))
# [(), ('a',), ('a',), ('a',), ('b',), ('c',), ('a', 'a'), ('a', 'a'), ('a', 'b'), ('a', 'c'), ('a', 'a'), ('a', 'b'), ('a', 'c'), ('a', 'b'), ('a', 'c'), ('b', 'c'), ('a', 'a', 'a'), ('a', 'a', 'b'), ('a', 'a', 'c'), ('a', 'a', 'b'), ('a', 'a', 'c'), ('a', 'b', 'c'), ('a', 'a', 'b'), ('a', 'a', 'c'), ('a', 'b', 'c'), ('a', 'b', 'c'), ('a', 'a', 'a', 'b'), ('a', 'a', 'a', 'c'), ('a', 'a', 'b', 'c'), ('a', 'a', 'b', 'c'), ('a', 'a', 'b', 'c'), ('a', 'a', 'a', 'b', 'c')]
collections.Counter
对象的幂集
在 Python 中,多集通常用 collections.Counter
而不是 list
表示.collections.Counter
类是 dict
的子类;它实现了将元素映射到计数的字典,以及一些有用的方法,例如通过计算序列中出现的次数来构建 Counter
.
Powerset of a collections.Counter
object
In Python, multisets are usually represented with a collections.Counter
rather than with a list
. The class collections.Counter
is a subclass of dict
; it implements dictionaries that map elements to counts, as well as several useful methods such as building a Counter
by counting occurrences in a sequence.
获取 Counter
的幂集是关于 stackoverflow 的另一个问题的主题:
Taking the powerset of a Counter
is the topic of another question on stackoverflow:
虽然我不知道在标准模块中已经实现了这种方法,但该问题的答案提供了一种使用 itertools
的解决方案:
Although I am not aware of an already-implemented method doing this in standard modules, the answer to that question presents one solution using itertools
:
import collections
import itertools
multiset = collections.Counter(['a', 'a', 'a', 'b', 'c'])
# Counter({'a': 3, 'b': 1, 'c': 1})
def powerset(multiset):
range_items = [[(x, z) for z in range(y + 1)] for x,y in multiset.items()]
products = itertools.product(*range_items)
return [{k: v for k, v in pairs if v > 0} for pairs in products]
print(powerset(multiset))
# [{}, {'c': 1}, {'b': 1}, {'b': 1, 'c': 1}, {'a': 1}, {'a': 1, 'c': 1}, {'a': 1, 'b': 1}, {'a': 1, 'b': 1, 'c': 1}, {'a': 2}, {'a': 2, 'c': 1}, {'a': 2, 'b': 1}, {'a': 2, 'b': 1, 'c': 1}, {'a': 3}, {'a': 3, 'c': 1}, {'a': 3, 'b': 1}, {'a': 3, 'b': 1, 'c': 1}]
这篇关于生成多重集的幂集的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!