一种高效的方法来生成所有可能的方法来配对数据集中的项目 [英] An efficient method to generate all possible ways to pair up items in a data set

查看:88
本文介绍了一种高效的方法来生成所有可能的方法来配对数据集中的项目的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

这在某种程度上是一个组合问题;我正在尝试找出一种有效的方法来配对数据集中的所有项目。

This is somewhat of a combinatorial problem; I'm trying to figure out an efficient way to pair up all items in a data set.

例如,我有一个长度为6的数组:[1,2,3,4,5,6],我想对所有可能的配对数组中的内容如下:

For example, I have an array of length 6: [1,2,3,4,5,6], and I want to make all possible pairings of the contents in the array as so:

[1,2],[3,4],[5,6]  
[1,2],[3,5],[4,6]  
[1,2],[3,6],[4,5]  
[1,3],[2,4],[5,6]  
[1,3],[2,5],[4,6]  
...

,依此类推。在此示例中,总共将有15种组合。通常,此解决方案中的可能性应为(N-1)!!其中N是数组的长度或要配对的项目数。在这种情况下,N始终为偶数。理想情况下,算法将连续生成所有可能性,而不必存储非常大的数组。

and so on. In this example, there would be 15 combinations total. In general, the number of possibilities in this solution should be (N-1)!! where N is the length of the array or the number of items to be paired up. N will always be even in this case. Ideally, an algorithm will generate all possibilities serially, without having to store very large arrays.

例如,一种方法的工作方式类似于轮循调度算法,在该算法中,将数组拆分为N / 2:

For example, one way would work somewhat like a 'round robin' scheduling algorithm where you split the array into N/2:

[1,2,3]  
[4,5,6]  

并顺时针旋转[5,3,6]以生成3个唯一的配对(垂直计数),固定[1,2,4]为:

and rotate the [5,3,6] clockwise to generate 3 unique pairings (counting vertically) with [1,2,4] fixed as so:

[1,2,3]  
[4,5,6]

[1,2,5]  
[4,6,3]  

[1,2,6]  
[4,3,5]

然后顺时针旋转[4,2,3,6,5]以生成5个唯一的配对,其中[1]是固定的,从最小的最里面的情况开始( N = 4)向外直到结束,但以递归方式进行。因为我不确定如何最好地将其表达为代码,或者不确定是否有更有效的方法,因为N可能非常大。

and then rotate [4,2,3,6,5] clockwise to generate 5 unique pairings with [1] fixed, going from the smallest innermost case (N=4) outwards until the end, but recursively. As it is I'm not sure how to best express this as code or if there is a more efficient way of doing this, as N can be very large.

推荐答案

您可以使用标准,但是每次您递归时都会沿列表前进2个元素,而列表中的第一个剩余元素是总是在每次递归时输出的对中的第一个元素,对中的第二个是其他其余元素。

You can generate the pairs using the standard recursive algorithm for generating permutations of a list, but with a twist that each time you recurse you advance 2 elements along the list, and the first remaining element in the list is always the first element of the pair you output at each recursion, with the second of the pair being each of the other remaining elements.

始终选择第一个其余元素作为配对中的第一项可以避免使用不同的配对生成相同的配对。

Always choosing the first remaining element as the first item in the pair avoids generating the same pairings with the pairs in different order.

与标准算法一样,您可以在不生成副本的情况下就地生成排列。

As with the standard algorithm, you can generate the permutations in place without making copies of the array, by swapping elements in place, recursing and then swapping back.

这是一些C代码来演示算法(我意识到这个问题被标记为Fortran,但是不要认为它是伪代码)。此代码在递归时传递列表并进行计数,但是您可以使用 items itemcount 作为全局变量来实现它:

Here is some C code to demonstrate the algorithm (I realise this question is tagged Fortran but just think of it as pseudo code). This code passes the list and count when it recurses, but you could implement it with items and itemcount as global variables:

// start is the current position in the list, advancing by 2 each time
// pass 0 as start when calling at the top level
void generatePairings(int* items, int itemcount, int start)
{
    if(itemcount & 1)
        return; // must be an even number of items

    // is this a complete pairing?
    if(start == itemcount)
    {
        // output pairings:
        int i;
        for(i = 0; i<itemcount; i+=2)
        {
            printf("[%d, %d] ", items[i], items[i+1]);
        }
        printf("\n");
        return;
    }

    // for the next pair, choose the first element in the list for the
    // first item in the pair (meaning we don't have to do anything 
    // but leave it in place), and each of the remaining elements for
    // the second item:
    int j;
    for(j = start+1; j<itemcount; j++)
    {
        // swap start+1 and j:
        int temp = items[start+1];
        items[start+1] = items[j];
        items[j] = temp;

        // recurse:
        generatePairings(items, itemcount, start+2);

        // swap them back:
        temp = items[start+1];
        items[start+1] = items[j];
        items[j] = temp;
    }
}

int main(void) {
    int items[6] = {1, 2, 3, 4, 5, 6};
    generatePairings(items, 6, 0);

    return 0;
}

输出:

[1, 2] [3, 4] [5, 6] 
[1, 2] [3, 5] [4, 6] 
[1, 2] [3, 6] [5, 4] 
[1, 3] [2, 4] [5, 6] 
[1, 3] [2, 5] [4, 6] 
[1, 3] [2, 6] [5, 4] 
[1, 4] [3, 2] [5, 6] 
[1, 4] [3, 5] [2, 6] 
[1, 4] [3, 6] [5, 2] 
[1, 5] [3, 4] [2, 6] 
[1, 5] [3, 2] [4, 6] 
[1, 5] [3, 6] [2, 4] 
[1, 6] [3, 4] [5, 2] 
[1, 6] [3, 5] [4, 2] 
[1, 6] [3, 2] [5, 4] 

如果您在大型对象列表上执行此操作,则对索引列表进行置换然后再使用它们对数组进行索引会更有效。对象,而不是对大量数据进行昂贵的交换操作。

If you are doing this on a list of large objects, it's more efficient to permute a list of indices and then use those to index into your array of objects, rather than doing expensive swap operations on large amounts of data.

这篇关于一种高效的方法来生成所有可能的方法来配对数据集中的项目的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆