创建两个列表中项目的所有可能组合的元组,而不必在元组中重复项目 [英] Creating tuples of all possible combinations of items from two lists, without duplicating items within tuples

查看:62
本文介绍了创建两个列表中项目的所有可能组合的元组,而不必在元组中重复项目的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我希望能够采用一定范围的数字,并返回包含三元组且无重复的列表。 x的每个元素应在三元组的每个位置出现一次。目标是获得如下所示的内容:

I would like to be able to take a range of numbers and return a list containing triples without duplicates. Each element of x should appear once in each position of the triples. The goal is to get something like the following:

get_combinations_without_duplicates(3) = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]

对于range(3),这是只是列表轮换,但对于更大的范围,存在更多可能的组合。我希望能够随机生成满足这些约束的三元组列表。

For range(3) this is just a list rotation, but for higher ranges, there are more possible combinations. I would like to be able to randomly generate a list of triples that satisfies these constraints.

假设我们首先为n =的情况指定每个三元组的第一个元素4:

Suppose we start by specifying the first element of each triple for the case where n=4:

[(0,),(1,),(2,),(3,)]

[(0,), (1,), (2,), (3,)]

第一个三元组的第二个元素可以是除0之外的任何其他元素。一旦选择了其中一个,则将限制下一个三元组的选项,依此类推。我们的目标是拥有一个可以接受数字并以此方式创建三元组的函数,但并不总是创建相同的三元组。也就是说,最终结果可能是轮换:

The second element of the first triple can be anything other than 0. Once one of these is chosen, then this limits the options for the next triple, and so on. The goal is to have a function that takes a number and creates the triples in this manner, but doesn't always create the same set of triples. That is, the end result could be a rotation:

[(0, 1, 2), (1, 2, 3), (2, 3, 0), (3, 0, 1),]

[(0, 2, 3), (1, 3, 0), (2, 0, 1), (3, 1, 2)]

以下是此功能的实现:

def get_combinations_without_duplicates(n):
    output = []
    second = range(n)
     third = range(n)
for i in range(n):
    triple = [i]
    #Get the second value of the triple, but make sure that it isn't a 
    #duplicate of the first value
    #in the triple or any value that has appeared in the second position of any triple
    choices_for_second = [number for number in second if number not in triple]
    #Randomly select a number from the allowed possibilities
    n_second = random.choice(choices_for_second) 
    #Append it to the triple
    triple.append(n_second)
    #Remove that value from second so that it won't be chosen for other triples
    second = [number for number in second if number != n_second]
    #Do the same for the third value
    choices_for_third = [number for number in third if number not in triple]
    n_third = random.choice(choices_for_third)
    triple.append(n_third)
    third = [number for number in third if number != n_third]
    output.append(tuple(triple))
return output

如下所示,此过程有时会随机选择无效的组合。如果您执行以下操作,则可以解决该问题:

As pointed out below, this process will sometimes randomly select combinations that don't work. That can be handled if you do something like:

def keep_trying(n):
    try:
        return get_combinations_without_duplicates(n)
    except IndexError:
        return keep_trying(n)

但是,我想知道总体上是否有更好的方法。

However, I'm wondering if there is a better way to do this in general.

推荐答案

让我们再试一次。

一些观察。


  1. 在元组的排序数组中,第一个值始终为零。

  2. 数组的长度将始终与数组中存在的元组的数量一样长。

  3. 您希望它们是随机生成。

  4. 元组按排序顺序生成。

  1. The first value will always be zero in a sorted array of your tuples.
  2. The length of the array will always be as long as the number of tuples that exist in your array.
  3. You want these to be randomly generated.
  4. The tuples are produced in 'sorted' order.

根据这些规范,我们可以提出一种程序方法;

Based on these specifications, we can come up with a procedural method;


  1. 生成2个连续整数列表,一个从中选择,另一个从中进行种子。

  2. 对于种子列表中的每个数字, [0、1、2、3] ,随机追加并删除一个不是已经在元素中。 [01,13,20,32]

  3. 生成另一个序列整数列表,然后重复。 [012,130,203,321]

  1. Generate 2 lists of serial integers, one to pick from, the other to seed from.
  2. For each number in the seed list, [0, 1, 2, 3], randomly append and remove a number that's not already in the element. [01, 13, 20, 32]
  3. Generate another list of serial integers, and repeat. [012, 130, 203, 321]

但是,这不工作。对于某些迭代,它将自身退回一个角,并且无法生成数字。例如, [01,13,20,32] ..附加[3,0,1 ...废话,我被卡住了。

But, this doesn't work. For some iterations, it will back itself into a corner and not be able to generate a number. For instance, [01, 13, 20, 32].. appending [3, 0, 1... crap, I'm stuck.

解决此问题的唯一方法是对整个行进行 true 改组,然后重新进行改组,直到适合为止。这可能会花费很多时间,并且随着设置时间的延长,只会变得更加痛苦。

The only way to fix this, is to do a true shuffling over the entire row, and reshuffle until one fits. This may take quite a bit of time, and will only get more painful as the sets get longer.

因此,从程序上讲:

解决方案1:随机生成


  1. 使用您的范围填充列表。 [0,1,2,3]

  2. 创建另一个列表。 [0,1,2,3]

  3. 随机播放列表。 [1、0、2、3]

  4. 随机播放,直到找到合适的... [1、2、3、0]

  5. 重复第三个元素。

  1. Populate a list with your range. [0, 1, 2, 3]
  2. Create another list. [0, 1, 2, 3]
  3. Shuffle the list. [1, 0, 2, 3]
  4. Shuffle until you find one that fits... [1, 2, 3, 0]
  5. Repeat with the third element.

通过此过程,虽然计算机可以非常快速地验证解决方案,但是无法很快生成解决方案。但是,它只是生成真正随机答案的两种方法之一。

With this procedure, while a computer can verify solutions very quickly, it cannot generate solutions very quickly. However, it is merely one of two ways to generate a truly random answer.

因此,最快的保证方法将使用验证程序,而不是生成程序。首先,生成所有可能的排列。

Therefore, the fastest guaranteed method would use make use of a verification procedure, rather than a generating procedure. First things first, generate all the possible permutations.

from itertools import permutations

n = 4
candidates = [i for i in permutations(xrange(n),3)]

然后。

解决方案2:随机验证


  1. 选择一个以0开头的三元组。

  2. Pop随机出现一个不以0开头的三元组。

  3. 验证随机选取的三元组是否为中间解决方案。

  4. 如果不是,请弹出另一个三元组。

  5. 如果是,请附加三元组,然后重新弹出三重奏队列

  6. 重复n次。 #或直到耗尽队列为止,此时重复n次自然会变成TRUE

  1. Pick a triplet that starts with 0.
  2. Pop, at random, a triplet that does not start with 0.
  3. Verify if the randomly picked triplet is an intermediate solution.
  4. If not, pop another triplet.
  5. If yes, append the triplet, and REPOPULATE THE TRIPLET QUEUE.
  6. Repeat n times. # or until you exhaust the queue, at which point repeat n times naturally becomes TRUE

从数学上讲,下一个三元组的解决方案可以保证在解决方案集中,因此,如果让它自己耗尽,应该出现随机解决方案。这种方法的问题在于,不能保证每个可能结果都有 equal 概率。

A solution for the next triplet is mathematically guaranteed to be in the solution set, so if you just let it exhaust itself, a randomized solution should appear. The problem with this approach is that there's no guarantee that every possible outcome has an equal probability.

解决方案3:迭代验证

对于等概率结果,请摆脱随机化,并生成每种可能的三元组组合,长列出n个列表,并验证每个候选解决方案。

For equal probability results, get rid of the randomization, and generate every possible 3-tuple combination, n-lists long-- and verify each of those solution candidates.

在候选解决方案列表中编写一个函数 verify 以生成每个解决方案,然后随机弹出一个该列表中的解决方案。

Write a function to verify over the list of candidate solutions to produce every solution, and then randomly pop a solution from that list.

from itertools import combinations

results = [verify(i) for i in combinations(candidates, n)]
# this is 10626 calls to verify for n=4, 5 million for n=5 
# this is not an acceptable solution.  

解决方案1或3都不很快,O(n ** 2),但是根据您的条件,这可能会和您想要真正随机的解决方案一样快。解决方案2将保证是这三种方法中最快的,通常是明显超过1或3,解决方案3具有最稳定的结果。您选择哪种方法取决于您要对输出执行的操作。

Neither Solution 1 or 3 is very fast, O(n**2), but given your criteria, it's possible this is as fast as it'll get if you want a truly random solution. Solution 2 will guaranteed be the fastest of these three, often times significantly beating 1 or 3, Solution 3 has the most stable results. Which of these approaches you choose will depend on what you want to do with the output.

随后:

最终,代码的速度将完全取决于随机性如何您希望您的代码成为。满足您的需求的元组序列的非常第一个(也是唯一第一个)实例吐出的算法可以运行得非常快,因为它只按顺序攻击排列一次,并且将在O(n)时间内运行。但是,它不会随机执行任何操作...

Ultimately, the speed of the code will be contingent on exactly how random you want your code to be. An algorithm to spit out the VERY first (and only the very first) instance of a tuple series that satisfies your requirement can run supremely quickly, as it just attacks the permutations in order, once, and it will run in O(n) time. However, it will not do anything randomly...

此外,这是一些verify(i)的快速代码。根据观察,两个元组在同一索引中可能没有相同的数字。

Also, here's some quick code for verify(i). It's based on the observation that two tuples may not have the same number in the same index.

def verify(t):
    """ Verifies that a set of tuples satisfies the combinations without duplicates condition. """
    zipt = zip(*t)
    return all([len(i) == len(set(i)) for i in zipt])

n = 4个完整解决方案集

n = 4 Full Solution Set

((0, 1, 2), (1, 0, 3), (2, 3, 0), (3, 2, 1))
((0, 1, 2), (1, 0, 3), (2, 3, 1), (3, 2, 0))
((0, 1, 2), (1, 2, 3), (2, 3, 0), (3, 0, 1))
((0, 1, 2), (1, 3, 0), (2, 0, 3), (3, 2, 1))
((0, 1, 3), (1, 0, 2), (2, 3, 0), (3, 2, 1))
((0, 1, 3), (1, 0, 2), (2, 3, 1), (3, 2, 0))
((0, 1, 3), (1, 2, 0), (2, 3, 1), (3, 0, 2))
((0, 1, 3), (1, 3, 2), (2, 0, 1), (3, 2, 0))
((0, 2, 1), (1, 0, 3), (2, 3, 0), (3, 1, 2))
((0, 2, 1), (1, 3, 0), (2, 0, 3), (3, 1, 2))
((0, 2, 1), (1, 3, 0), (2, 1, 3), (3, 0, 2))
((0, 2, 1), (1, 3, 2), (2, 0, 3), (3, 1, 0))
((0, 2, 3), (1, 0, 2), (2, 3, 1), (3, 1, 0))
((0, 2, 3), (1, 3, 0), (2, 0, 1), (3, 1, 2))
((0, 2, 3), (1, 3, 2), (2, 0, 1), (3, 1, 0))
((0, 2, 3), (1, 3, 2), (2, 1, 0), (3, 0, 1))
((0, 3, 1), (1, 0, 2), (2, 1, 3), (3, 2, 0))
((0, 3, 1), (1, 2, 0), (2, 0, 3), (3, 1, 2))
((0, 3, 1), (1, 2, 0), (2, 1, 3), (3, 0, 2))
((0, 3, 1), (1, 2, 3), (2, 1, 0), (3, 0, 2))
((0, 3, 2), (1, 0, 3), (2, 1, 0), (3, 2, 1))
((0, 3, 2), (1, 2, 0), (2, 1, 3), (3, 0, 1))
((0, 3, 2), (1, 2, 3), (2, 0, 1), (3, 1, 0))
((0, 3, 2), (1, 2, 3), (2, 1, 0), (3, 0, 1))

n = 5有552个唯一解。这是前20个。

n = 5 has 552 unique solutions. Here are the first 20.

((0, 1, 2), (1, 0, 3), (2, 3, 4), (3, 4, 0), (4, 2, 1))
((0, 1, 2), (1, 0, 3), (2, 3, 4), (3, 4, 1), (4, 2, 0))
((0, 1, 2), (1, 0, 3), (2, 4, 0), (3, 2, 4), (4, 3, 1))
((0, 1, 2), (1, 0, 3), (2, 4, 1), (3, 2, 4), (4, 3, 0))
((0, 1, 2), (1, 0, 4), (2, 3, 0), (3, 4, 1), (4, 2, 3))
((0, 1, 2), (1, 0, 4), (2, 3, 1), (3, 4, 0), (4, 2, 3))
((0, 1, 2), (1, 0, 4), (2, 4, 3), (3, 2, 0), (4, 3, 1))
((0, 1, 2), (1, 0, 4), (2, 4, 3), (3, 2, 1), (4, 3, 0))
((0, 1, 2), (1, 2, 0), (2, 3, 4), (3, 4, 1), (4, 0, 3))
((0, 1, 2), (1, 2, 0), (2, 4, 3), (3, 0, 4), (4, 3, 1))
((0, 1, 2), (1, 2, 3), (2, 0, 4), (3, 4, 0), (4, 3, 1))
((0, 1, 2), (1, 2, 3), (2, 0, 4), (3, 4, 1), (4, 3, 0))
((0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 0), (4, 0, 1))
((0, 1, 2), (1, 2, 3), (2, 4, 0), (3, 0, 4), (4, 3, 1))
((0, 1, 2), (1, 2, 3), (2, 4, 1), (3, 0, 4), (4, 3, 0))
((0, 1, 2), (1, 2, 4), (2, 0, 3), (3, 4, 0), (4, 3, 1))
((0, 1, 2), (1, 2, 4), (2, 0, 3), (3, 4, 1), (4, 3, 0))
((0, 1, 2), (1, 2, 4), (2, 3, 0), (3, 4, 1), (4, 0, 3))
((0, 1, 2), (1, 2, 4), (2, 3, 1), (3, 4, 0), (4, 0, 3))
((0, 1, 2), (1, 2, 4), (2, 4, 3), (3, 0, 1), (4, 3, 0))

因此,您可以生成这样的解决方案,但这需要时间。如果要利用这一点,我将缓存按原样生成的解决方案,然后在需要n时从它们中随机抽取它们。顺便提一句,n = 5花了不到一分钟的时间才能完成蛮力。由于解为O(n ** 2),所以我希望n = 6花费一个小时,一天中n = 7。获得真正的随机解决方案的唯一方法就是这样做。

So, you can generate solutions like this, but it takes time. If you were going to utilize this, I would cache the solutions generated as is, and then randomly pull from them when you need them for whatever number n. Incidentally, n=5 took a little under a minute to complete, brute force. Since the solution is O(n**2), I expect n=6 to take over an hour, n=7 over a day. The only way you can get return a true randomized solution is by doing it this way.

编辑:不等分布的随机解:

以下是代码I为解决此问题而写的解决方案2 的实现。我认为我会发布它,因为它是部分不平等的分配解决方案,并且在足够的时间内保证生成所有可能的解决方案

The following is code I wrote in attempting to solve this problem, an implementation of Solution 2. I figured I would post it, since it is a partial, non-equal distribution solution, and generates every possible solution, guaranteed, given enough time.

def seeder(n):
    """ Randomly generates the first element in a solution. """
    seed = [0]
    a = range(1, n)
    for i in range(1, 3):
        seed.append(a.pop(random.randint(0,len(a)-1)))
    return [seed]

def extend_seed(seed, n):
    """ Semi-Randomly generates the next element in a solution. """
    next_seed = [seed[-1][0] + 1]
    candidates = range(0, n)
    for i in range(1, 3):
        c = candidates[:]
        for s in next_seed:
            c.remove(s)
        for s in seed:
            try:
                c.remove(s[i])
            except ValueError:
                pass
        next_seed.append(c.pop(random.randint(0,len(c)-1)))
    seed.append(next_seed)
    return seed

def combo(n):
    """ Extends seed until exhausted. 
    Some random generations return results shorter than n. """
    seed = seeder(n)
    while True:
        try:
            extend_seed(seed, n)
        except (ValueError, IndexError):
            return seed

def combos(n):
    """ Ensures that the final return is of length n. """
    while True:
        result = combo(n)
        if len(result) == n:
            return result

这篇关于创建两个列表中项目的所有可能组合的元组,而不必在元组中重复项目的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆