How to iterate through all partitions of a list with a condition on the subset lengths
For certain purposes, I need to generate an iterable that lists all the partitions of a list, but with a condition on the subset lengths. That is, I want to partition my list into subsets of equal length (3 here), except for the last one if the length of the list isn't a multiple of 3.
For example, ['a','b','c','d','e'] should give all partitions into 2 subsets, of lengths 3 and 2.
If I simply use:
[p for p in multiset_partitions(['a','b','c','d','e'],2)]
Out:
[[['a', 'b', 'c', 'd'], ['e']],
[['a', 'b', 'c', 'e'], ['d']],
[['a', 'b', 'c'], ['d', 'e']],
.....
[['a', 'd'], ['b', 'c', 'e']],
[['a', 'e'], ['b', 'c', 'd']],
[['a'], ['b', 'c', 'd', 'e']]]
I get them all. So my best attempt so far has been to filter out the partitions that contain at least one subset of length > 3:
from sympy.utilities.iterables import multiset_partitions

def partitions(liste):
    compte = 0
    n = len(liste)//3 + 1
    for p in multiset_partitions(liste, n):
        l = len(p)
        oversize = False
        i = 0
        while not(oversize) and i != l:
            if len(p[i]) > 3:
                oversize = True
            i += 1
        if oversize == False:
            compte += 1
            # do something with p
    return(compte)  # I'm just counting out the number of partitions right now
This does the trick, but it is clearly not the most efficient way to achieve what I want, especially since the number of partitions becomes huge very quickly as the length of the list grows.
(10 for a length of 5, but 9100 for 10, 800800 for 13...)
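These counts can be cross-checked without enumerating anything. The sketch below is only an illustration, assuming the items are all distinct; `count_partitions` is a hypothetical helper name, not part of sympy. It counts the partitions whose block sizes match a given list, and summing over the size patterns that the filter above lets through (no block longer than 3) reproduces the figures quoted.

```python
from collections import Counter
from math import factorial

def count_partitions(sizes):
    """Count the ways to split sum(sizes) distinct items into unordered blocks of these sizes."""
    total = factorial(sum(sizes))
    for s in sizes:
        total //= factorial(s)             # order inside a block does not matter
    for multiplicity in Counter(sizes).values():
        total //= factorial(multiplicity)  # blocks of equal size are interchangeable
    return total

# Length 5, two blocks, none longer than 3: only the pattern (3, 2) remains.
print(count_partitions([3, 2]))                                               # 10
# Length 10, four blocks: patterns (3, 3, 3, 1) and (3, 3, 2, 2).
print(count_partitions([3, 3, 3, 1]) + count_partitions([3, 3, 2, 2]))        # 9100
# Length 13, five blocks: patterns (3, 3, 3, 3, 1) and (3, 3, 3, 2, 2).
print(count_partitions([3, 3, 3, 3, 1]) + count_partitions([3, 3, 3, 2, 2]))  # 800800
```

Note that for lengths 10 and 13 the totals include patterns with two blocks of length 2, which pass an "at most 3" test but are not "all blocks of length 3 except one".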
What would be the most efficient, Pythonic way?
Thanks in advance,
Thierry
You can always wrap filter around the partitioning function. You can use a lambda function to ensure that all of the elements are of length 3 except the last one.
list(filter(lambda x: all(len(z)==3 for z in x[:-1]), multiset_partitions('abcde', 2)))
# returns:
[[['a', 'b', 'c'], ['d', 'e']],
[['a', 'b', 'd'], ['c', 'e']],
[['a', 'b', 'e'], ['c', 'd']],
[['a', 'c', 'd'], ['b', 'e']],
[['a', 'c', 'e'], ['b', 'd']],
[['a', 'd', 'e'], ['b', 'c']]]
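The positional check x[:-1] only accepts a partition when its short block happens to be listed last, which is why six results appear above while the question counts 10. A minimal sketch of an order-independent predicate follows, assuming "except the last one" means "at most one shorter block, wherever it is listed"; `has_block_sizes` is a hypothetical name introduced here for illustration.

```python
def has_block_sizes(partition, size=3):
    """True if every block has `size` items, except at most one shorter block in any position."""
    n_items = sum(len(block) for block in partition)
    full, rest = divmod(n_items, size)
    # Expected block lengths in descending order, e.g. [3, 2] for 5 items.
    expected = [size] * full + ([rest] if rest else [])
    return sorted(map(len, partition), reverse=True) == expected

# The positional test from the answer, kept here for comparison.
positional = lambda x: all(len(z) == 3 for z in x[:-1])

p = [['a', 'd'], ['b', 'c', 'e']]   # a partition listed in the question's output
print(positional(p))        # False: the short block comes first
print(has_block_sizes(p))   # True: the block sizes are still {3, 2}
```

The predicate can be passed to filter in place of the lambda, e.g. filter(has_block_sizes, multiset_partitions('abcde', 2)).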
You will have to be careful when selecting the number of partitions to ensure you are using ceil. I.e., for 10 items, you want ceil(10/3), not 10//3.
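Filtering still generates every partition before discarding most of them. As an alternative, here is a rough sketch that builds only the wanted partitions directly with itertools.combinations. It is not the answer's method and does not use sympy; it assumes the items are distinct (duplicates are treated as distinct by position), and `equal_blocks` and `fixed_size_partitions` are hypothetical names. The number of blocks it produces is ceil(len(items)/size).

```python
from itertools import combinations
from math import ceil

def equal_blocks(items, size):
    """Yield every partition of `items` into blocks of exactly `size` (len(items) % size == 0)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    # Pinning the first item to the first block avoids emitting a partition twice.
    for idx in combinations(range(len(rest)), size - 1):
        chosen = set(idx)
        block = [first] + [rest[i] for i in idx]
        remaining = [x for i, x in enumerate(rest) if i not in chosen]
        for tail in equal_blocks(remaining, size):
            yield [block] + tail

def fixed_size_partitions(items, size=3):
    """Yield partitions into blocks of `size`, plus one shorter block (listed last) if needed."""
    items = list(items)
    short = len(items) % size
    # Choose the leftover block first, then split what remains into full blocks.
    for idx in combinations(range(len(items)), short):
        chosen = set(idx)
        leftover = [items[i] for i in idx]
        full = [x for i, x in enumerate(items) if i not in chosen]
        for blocks in equal_blocks(full, size):
            yield blocks + ([leftover] if leftover else [])

print(sum(1 for _ in fixed_size_partitions('abcde')))     # 10
print(sum(1 for _ in fixed_size_partitions(range(10))))   # 2800
print(len(next(fixed_size_partitions(range(10)))), ceil(10 / 3))  # 4 4
```

For 10 items this yields 2800 partitions (block sizes 3, 3, 3, 1), fewer than the 9100 kept by the question's filter, because that filter also lets through partitions with sizes 3, 3, 2, 2.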