Python - Get all unique combinations with replacement from a list of lists with unequal lengths


Question

Note: This is not a duplicate question, as the title might suggest.

I have a list of lists, and I need to get all combinations from it with replacement.

import itertools

l = [[1,2,3] ,[1,2,3],  [1,2,3]]
n = []
for i in itertools.product(*l):
    if sorted(i) not in n:
        n.append(sorted(i))
for i in n:
    print(i)

[1, 1, 1]
[1, 1, 2]
[1, 1, 3]
[1, 2, 2]
[1, 2, 3]
[1, 3, 3]
[2, 2, 2]
[2, 2, 3]
[2, 3, 3]
[3, 3, 3]
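
For the special case above, where every sublist is identical, the same ten rows can be produced without generating and filtering the full product. This is only a sketch of an alternative, not the code from the question: it uses the standard-library itertools.combinations_with_replacement, and the helper name identical_case is hypothetical.

```python
from itertools import combinations_with_replacement

def identical_case(values, length):
    # Yields each sorted selection exactly once, so no de-duplication
    # pass and no list of already-seen rows is needed.
    return [list(c) for c in combinations_with_replacement(values, length)]

rows = identical_case([1, 2, 3], 3)
for row in rows:
    print(row)
```

This only applies when all sublists hold the same values; it does not cover the mixed case described later in the question.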

Thanks to @RoadRunner and @Idlehands.

The code above works, but it has two problems:

  1. For large lists, itertools.product throws a MemoryError. When l has 18 sublists of length 3, there are about 400 million combinations.

  2. Order matters, so sorted would not work for my problem. This could be confusing, so I explain it with the example below.

l = [[1,2,3],[1],[1,2,3]]

Here I have 2 unique groups:

Group 1: elements 0 and 2, which have the same value [1,2,3]

Group 2: element 1, which has the value [1]

Thus, the solutions I need are:

[1,1,1]
[1,1,2]
[1,1,3]
[2,1,2]
[2,1,3]
[3,1,3]

So position 1 is fixed at 1.

Hope this example helps.
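
One way to read the requirement above is: positions that share the same sublist form a group, order is ignored within a group, and each value stays in its original position. Under that assumption (and assuming the values inside each sublist are distinct), the rows can be generated directly instead of filtered. The sketch below is not from the original thread; the function name grouped_combinations is hypothetical.

```python
from itertools import combinations_with_replacement, product

def grouped_combinations(lists):
    # Map each distinct sublist (as a tuple) to the positions where it occurs.
    positions = {}
    for index, sub in enumerate(lists):
        positions.setdefault(tuple(sub), []).append(index)

    # For each group, choose one value per position, ignoring order within the group.
    per_group = [
        combinations_with_replacement(values, len(idxs))
        for values, idxs in positions.items()
    ]

    # Combine the per-group choices and write them back to their original positions.
    for choice in product(*per_group):
        row = [None] * len(lists)
        for idxs, picked in zip(positions.values(), choice):
            for idx, value in zip(idxs, picked):
                row[idx] = value
        yield row

example = list(grouped_combinations([[1, 2, 3], [1], [1, 2, 3]]))
for row in example:
    print(row)
```

Because only the wanted rows are ever produced, the 18-sublist case yields 190 rows without walking through the roughly 400 million raw products.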

Recommended answer

Revised answer:

Based on the new information, to cope with the huge number of combinations overwhelming itertools.product(), we can try pulling the products in small batches:

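from itertools import product

# 18 sublists of length 3, i.e. 3**18 possible products
l = [list(range(3))] * 18
prods = product(*l)   # lazy iterator over the products
uniques = set()       # sorted tuples already seen
results = []          # first product seen for each sorted tuple, original order kept
totals = 0

def run_batch(n=1000000):
    global totals
    for i in range(n):
        try:
            result = next(prods)
        except StopIteration:
            break
        unique = tuple(sorted(result))
        if unique not in uniques:
            uniques.add(unique)
            results.append(result)
    # i is the last loop index, so a full batch adds n - 1 rather than n
    totals += i

run_batch()
print('Total iteration this batch: {0}'.format(totals))
print('Number of unique tuples: {0}'.format(len(uniques)))
print('Number of wanted combos: {0}'.format(len(results)))
print('First 10 results:')
for r in results[:10]:
    print(r)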

Output:

Total iteration this batch: 999999
Number of unique tuples: 103
Number of wanted combos: 103
First 10 results:
(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1)
(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2)
(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1)
(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2)
(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2)
(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1)
(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 2)
(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 2)
(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2)

Here we control the batch size by calling next(prods) inside a range of your choice, and you can keep calling run_batch() for as long as you see fit. uniques holds the sorted tuples in a set as a reference point, and results holds the combinations in the order you wanted. Both should be the same size, and they were surprisingly small when I ran the full 3^18 products. I'm not well acquainted with memory allocation, but this way the program shouldn't store all the unwanted results in memory, so you should have more wiggle room. Otherwise, you can always export the results to a file to make room. This sample only shows the lengths of the collections, but you can easily display or save their contents for your own purposes.
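
The batching idea can also be written as a generator that hands back one batch of newly seen combinations at a time, which makes it easy to write each batch to a file before moving on. This is a minimal sketch rather than the answer's code: it swaps the next()/try loop for itertools.islice, and the name unique_in_batches and its parameters are hypothetical.

```python
from itertools import islice, product

def unique_in_batches(lists, batch_size=1000):
    prods = product(*lists)
    seen = set()  # sorted tuples already encountered
    while True:
        # Pull at most batch_size products from the lazy iterator.
        batch = list(islice(prods, batch_size))
        if not batch:
            return
        fresh = []
        for combo in batch:
            key = tuple(sorted(combo))
            if key not in seen:
                seen.add(key)
                fresh.append(combo)
        # Hand back only the combinations first seen in this batch.
        yield fresh

small = [[0, 1, 2]] * 4
found = [combo for fresh in unique_in_batches(small, batch_size=10) for combo in fresh]
print(len(found))
```

Only the current batch and the set of sorted keys are held in memory at once; the caller decides whether to keep, print, or save each batch.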

I can't claim this is the best or most optimized approach, but it seems to work for me. Maybe it'll work for you? Running the batch 5 times took about 10s (about 2s per batch). The entire set of prods took me 15 minutes to run:

Total iteration: 387420102
Number of unique tuples: 190
Number of wanted combos: 190
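
These totals can be sanity-checked with a little arithmetic, assuming "unique" means "same multiset of values": 18 positions with 3 choices each give 3^18 products, and the number of multisets of size 18 drawn from 3 values is C(3 + 18 - 1, 18). The snippet below is only a check, not part of the answer; math.comb requires Python 3.8 or later.

```python
import math

sublists, values = 18, 3

# Every position picks independently from 3 values.
total_products = values ** sublists
# Multisets of size 18 over 3 values ("stars and bars").
unique_multisets = math.comb(values + sublists - 1, sublists)

print(total_products)    # 387420489
print(unique_multisets)  # 190
```

The reported iteration total of 387420102 is 387 short of 3^18, which is consistent with the counter adding n - 1 for each of the 387 full batches.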


Original answer:

@RoadRunner had a neat solution with sort() and defaultdict, but I feel the latter was not needed. I took his sort() suggestion and implemented a modified version here.

From this answer:

import itertools

l = [[1], [1, 2, 3], [1, 2, 3]]
n = []
for i in itertools.product(*l):
    if sorted(i) not in n:
        n.append(sorted(i))
for i in n:
    print(i)

Output:

[1, 1, 1]
[1, 1, 2]
[1, 1, 3]
[1, 2, 2]
[1, 2, 3]
[1, 3, 3]
