产生的顺序的重复列表无论 [英] Generating a list of repetitions regardless of the order

查看:142
本文介绍了产生的顺序的重复列表无论的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我要生成一个列表插槽联想指数组合。例如,(0,0,1)表示0和1属于相同的插槽,而2属于一个其他。 (0,1,1,1)表示1,2,3属于相同的插槽,而0是本身。在本例中,0和1是仅有的方式识别这些时隙,但不携带信息为我的用法。

因此​​,(0,0,0)是绝对等同于(1,1,1)的我的目的,而(0,0,1)等同于(1,1,0)

经典笛卡尔乘积会产生大量的这些重复我想的摆脱。

这是我获得与 itertools.product

 >>> LEN,SIZE =(3,1)
>>>列表(itertools.product(范围(SIZE + 1),则重复= LEN))
>>>
[(0,0,0),
(0,0,1),
(0,1,0),
(0,1,1),
(1,0,0),
(1,0,1),
(1,1,0),
(1,1,1)]
 

这是我想获得:

 >>> [(0,0,0),
(0,0,1),
(0,1,0),
(0,1,1)]
 

这是很容易的小名单,但我不太看如何与大集合做到这一点。你有一个建议?

如果还不清楚,请告诉我,让我可以澄清我的问题。谢谢!

修改:基于Sneftel的回答,这个功能似乎工作,但我不知道是否确实得到所有的结果:

 高清测试():
    对于p中的产物(范围(2),重复= 3):
        J = -1
        好= TRUE
        对于k的电话号码:
            如果K> j和(K-j)条> 1:
                好=假
            ELIF K>记者:
                Ĵ= K
        如果好:
            生成P
 

解决方案

我会通过了以下意见启动:

  1. 在每个组合的第一个元素必须为0。
  2. 第二个元素必须为0或1。
  3. 第三个元素必须是0,1或2,而那只能是2,如果第二个元素为1

这些意见建议如下算法:

 高清分配(N,M​​,使用= 0):
    生成的`N`项目分配到`M`区分
    桶,其中`used`桶已经被沿用至今。

        >>>列表(分配(3,1))
        [(0,0,0)]
        >>>列表(分配(3,2))
        [(0,0,0),(0,0,1),(0,1,0),(0,1,1)]
        >>>列表(分配(3,3))
        [(0,0,0),(0,0,1),(0,1,0),(0,1,1),(0,1,2)]

    
    如果n == 0:
        产量 ()
        返回
    AA =列表(分配(N  -  1,男,使用))
    对于第一个范围(使用):
        对于在AA:
            产量(第一,)+一
    如果使用< M:
        对于在分配(N  -  1,男,使用+ 1):
            收益率(用)+一
 

此处理你的使用情况(12个项目,5桶),在几秒钟:

 >>>从timeit进口timeit
>>> timeit(拉姆达:表(分配(12,5)),数= 1)
4.513746023178101
>>>总和(1 _在分配(12,5))
2079475
 

这是大大超过你的付出在你的答案(一个调用产品,然后删除无效的分配)结束的功能更快的是,如果它被修改处理(12,5)用例:

 >>> timeit(拉姆达:列表(试验(12,5)),数= 1)
540.693009853363
 

I want to generate combinations that associate indices in a list with "slots". For instance,(0, 0, 1) means that 0 and 1 belong to the same slot while 2 belongs to an other. (0, 1, 1, 1) means that 1, 2, 3 belong to the same slot while 0 is by itself. In this example, 0 and 1 are just ways of identifying these slots but do not carry information for my usage.

Consequently, (0, 0, 0) is absolutely identical to (1, 1, 1) for my purposes, and (0, 0, 1) is equivalent to (1, 1, 0).

The classical cartesian product generates a lot of these repetitions I'd like to get rid of.

This is what I obtain with itertools.product :

>>> LEN, SIZE = (3,1)
>>> list(itertools.product(range(SIZE+1), repeat=LEN))
>>>
[(0, 0, 0),
(0, 0, 1),
(0, 1, 0),
(0, 1, 1),
(1, 0, 0),
(1, 0, 1),
(1, 1, 0),
(1, 1, 1)]

And this is what I'd like to get:

>>> [(0, 0, 0),
(0, 0, 1),
(0, 1, 0),
(0, 1, 1)]

It is easy with small lists but I don't quite see how to do this with bigger sets. Do you have a suggestion?

If it's unclear, please tell me so that I can clarify my question. Thank you!

Edit: based on Sneftel's answer, this function seems to work, but I don't know if it actually yields all the results:

def test():
    for p in product(range(2), repeat=3):
        j=-1
        good = True
        for k in p:
            if k> j and (k-j) > 1:
                good = False
            elif k >j:
                j = k
        if good:
            yield p

解决方案

I would start by making the following observations:

  1. The first element of each combination must be 0.
  2. The second element must be 0 or 1.
  3. The third element must be 0, 1 or 2, but it can only be 2 if the second element was 1.

These observations suggest the following algorithm:

def assignments(n, m, used=0):
    """Generate assignments of `n` items to `m` indistinguishable
    buckets, where `used` buckets have been used so far.

        >>> list(assignments(3, 1))
        [(0, 0, 0)]
        >>> list(assignments(3, 2))
        [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)]
        >>> list(assignments(3, 3))
        [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1), (0, 1, 2)]

    """
    if n == 0:
        yield ()
        return
    aa = list(assignments(n - 1, m, used))
    for first in range(used):
        for a in aa:
            yield (first,) + a
    if used < m:
        for a in assignments(n - 1, m, used + 1):
            yield (used,) + a

This handles your use case (12 items, 5 buckets) in a few seconds:

>>> from timeit import timeit
>>> timeit(lambda:list(assignments(12, 5)), number=1)
4.513746023178101
>>> sum(1 for _ in assignments(12, 5))
2079475

This is substantially faster than the function you give at the end of your answer (the one that calls product and then drops the invalid assignments) would be if it were modified to handle the (12, 5) use case:

>>> timeit(lambda:list(test(12, 5)), number=1)
540.693009853363

这篇关于产生的顺序的重复列表无论的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆