在vb.net中查找列表值的所有组合(笛卡尔积) [英] Finding All Combinations (cartesian product) of list values in vb.net

查看:86
本文介绍了在vb.net中查找列表值的所有组合(笛卡尔积)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

如何生成N个可变长度的vb列表中N个值的所有组合?

How can I produce all of the combinations of the values in N number of vb list of variable lengths?

假设我有N个vb列表,例如

Let's say I have N number of vb lists, e.g.

first = {'a', 'b', 'c', 'd'}
second = {'e'}
third =  {'f', 'g', 'h', 'i', 'j'}

(在此示例中为三个列表,但针对该问题的列表为N个.)

(Three list in this example, but its N number of lists for the problem.)

我想输出其值的所有组合,以按顺序生成列表列表.

And I want to output all the combinations of their values, to produce a list of lists in the order.

{
{a,e,f}
{a,e,g}
{a,e,h}
{a,e,i}
{a,e,j}
{b,e,f}
{b,e,g}
....
{d,e,j}
}

推荐答案

简介

您要执行的操作称为:笛卡尔积

What you want to do is called: cartesian product

让我们先命名吧.我将把您的输入列表命名为L_i,其中1 <= i <= n.我还将命名为S_i输入列表的大小L_i.

Let's do some naming before going further. I will name your input lists L_i where 1 <= i <= n. I will also name S_i the size of the input list L_i.

我们可以问一个问题:what is the size of the output ?

We could ask the question: what is the size of the output ?

如果只有一个列表L_1,将有S_1个输出列表,每个列表仅包含L_1的一个元素.

If there is only one list L_1, there will be S_1 output lists, each one containing exactly one element of L_1.

如果有两个列表{L_1, L_2}.对于L_1的每个元素,我都可以附加S_2L_2不同元素.由于L_1S_1个元素,因此使S_1*S_2个不同的输出列表.

If there are two lists {L_1, L_2}. For each element of L_1 I can append S_2 different elements of L_2. As there are S_1 elements of L_1 it makes S_1*S_2 different output lists.

我们可以继续对n列表进行推理,并证明输出列表的数量为:S_1*...*S_n.

We can continue the reasoning to n lists and prove that the number of output lists will be: S_1*...*S_n.

这对我们有什么帮助?因为我们现在可以在数字i和输出列表之间创建映射.

How does that help us ? Because we can now create a mapping between a number i and an output list.

给出i一个数字0<=i<S_1*...*S_n,输出列表包含:

Given i a number 0<=i<S_1*...*S_n, the output list contains:

element of L_1 at index i/(S_2*S_3*...*S_n) MOD S_1
element of L_2 at index i/(    S_3*...*S_n) MOD S_2
...
element of L_n at index i                   MOD S_n

实施示例

我不了解VB.net,所以我选择了使用相同.net平台的C#. 我决定使用yield return函数,以便我们不会分配比所需更多的内存.如果只需要打印输出,它将只消耗一个ulong内存,而不是分配很大的输出列表.

I don't know VB.net so I chose C# which uses the same .net platform. I decided to use a yield return function so that we don't allocate more memory than needed. If you just need to print the outputs it will only consume a single ulong of memory instead of allocating a very big list of output lists.

using System;
using System.Collections.Generic;
using System.Linq;

namespace cartesian_product
{
    class Program
    {
        public static IEnumerable<List<T>> cartesian_product<T>(IEnumerable<List<T>> lists)
        {
            ulong MAX_I = lists.Select(l => (ulong)l.Count)
                               .Aggregate(1ul, (a, b) => a * b);

            for (ulong i = 0; i < MAX_I; ++i)
            {
                var output = new List<T>();

                ulong div = MAX_I;
                ulong index, s_i;
                foreach (var l in lists)
                {
                    s_i    = (ulong)l.Count;
                    div   /= s_i;
                    index  = (i/div)%s_i;
                    output.Add(l[(int)index]);
                }
                yield return output;
            }
        }

        static void Main(string[] args)
        {
            var first = new List<Char> {'a', 'b', 'c', 'd'};
            var second = new List<Char> {'e'};
            var third = new List<Char> {'f', 'g', 'h', 'i', 'j'};

            Console.WriteLine("{");
            foreach (var output in cartesian_product(new[]{first, second, third}))
            {
                Console.WriteLine("{{{0}}}", string.Join(",", output));
            }
            Console.WriteLine("}");
        }
    }
}

输出为:

{
{a,e,f}
{a,e,g}
{a,e,h}
{a,e,i}
{a,e,j}
{b,e,f}
{b,e,g}
{b,e,h}
{b,e,i}
{b,e,j}
{c,e,f}
{c,e,g}
{c,e,h}
{c,e,i}
{c,e,j}
{d,e,f}
{d,e,g}
{d,e,h}
{d,e,i}
{d,e,j}
}

限制

一个人可能会问:what if the product of the lists length overflows the variable used to index the outputs ?.

这是一个实际的理论问题,但是我在代码中使用了ulong,如果输出列表的总数使该变量溢出,则无论您使用哪种方法,都几乎没有机会枚举输出. (因为理论输出将包含超过2^64个列表).

This is a real theoretical problem, but I use a ulong in my code and if the total number of ouput lists overflows this variable there is little chance that you can enumerate the output whatever method you use. (because the theoretical output will contain more than 2^64 lists).

应用

OP并没有解释为什么他首先需要这种算法.因此,读者可能会想why is this useful ?.其中一个原因可能是生成用于回归测试的测试用例.假设您有一个遗留函数,将三个变量作为输入.您可以为每个参数生成一些可能的值,并为每个可能的参数集使用笛卡尔乘积收集函数的结果.重构遗留代码后,可以通过比较新代码输出和遗留代码输出来确保没有回归.

The OP didn't explain why he needed this algorithm in the first place. So the reader may wonder why is this useful ?. One reason among others may be to generate test cases for regression testing. Let's say you have a legacy function taking as input three variables. You could generate some possible values for each of the parameters and using the cartesian product collect result of the function for each possible set of parameters. After refactoring the legacy code, you could ensure there is no regression by comparing the new code output and the legacy code output.

这篇关于在vb.net中查找列表值的所有组合(笛卡尔积)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆