Subset sum problem

Question

I'm having a problem with counting, which is a continuation of this question. I'm not really a math person, so it's hard for me to figure out the subset sum problem that was suggested as a resolution.

I have 4 ArrayLists in which I hold data: alID, alTransaction, alNumber, alPrice

Type | Transaction  | Number       | Price
8    | Buy          | 95.00000000  | 305.00000000
8    | Buy          | 126.00000000 | 305.00000000
8    | Buy          | 93.00000000  | 306.00000000
8    | Transfer out | 221.00000000 | 305.00000000
8    | Transfer in  | 221.00000000 | 305.00000000
8    | Sell         | 93.00000000  | 360.00000000
8    | Sell         | 95.00000000  | 360.00000000
8    | Sell         | 126.00000000 | 360.00000000
8    | Buy          | 276.00000000 | 380.00000000

In the end I'm trying to get what's left for the customer, and I put what's left into 3 ArrayLists:
- alNewHowMuch (corresponds to alNumber),
- alNewPrice (corresponds to alPrice),
- alNewInID (corresponds to alID)

        ArrayList alNewHowMuch = new ArrayList();
        ArrayList alNewPrice = new ArrayList();
        ArrayList alNewInID = new ArrayList();
        for (int i = 0; i < alTransaction.Count; i++) {
            string transaction = (string) alTransaction[i];
            string id = (string) alID[i];
            decimal price = (decimal) alPrice[i];
            decimal number = (decimal) alNumber[i];
            switch (transaction) {
                case "Transfer out":
                case "Sell":
                    int index = alNewHowMuch.IndexOf(number);
                    if (index != -1) {
                        alNewHowMuch.RemoveAt(index);
                        alNewPrice.RemoveAt(index);
                        alNewInID.RemoveAt(index);
                    } else {
                        ArrayList alTemp = new ArrayList();
                        decimal sum = 0;
                        for (int j = 0; j < alNewHowMuch.Count; j ++) {
                            string tempid = (string) alNewInID[j];
                            decimal tempPrice = (decimal) alNewPrice[j];
                            decimal tempNumbers = (decimal) alNewHowMuch[j];
                            if (id == tempid && tempPrice == price) {
                                alTemp.Add(j);
                                sum = sum + tempNumbers;
                            }
                        }
                        if (sum == number) {
                            for (int j = alTemp.Count - 1; j >= 0; j --) {
                                int tempIndex = (int) alTemp[j];
                                alNewHowMuch.RemoveAt(tempIndex);
                                alNewPrice.RemoveAt(tempIndex);
                                alNewInID.RemoveAt(tempIndex);
                            }
                        }
                    }
                    break;
                case "Transfer In":
                case "Buy":
                    alNewHowMuch.Add(number);
                    alNewPrice.Add(price);
                    alNewInID.Add(id);
                    break;
            }
        }

Basically, I'm adding and removing things from the ArrayLists depending on transaction type, transaction ID and number. I add numbers such as 156 and 340 to the ArrayList (when the transaction is Transfer In or Buy), and later remove those same numbers (when it's Transfer Out or Sell). My solution works for that without a problem. The problem is that for some old data, employees entered a single sum such as 1500 instead of the separate amounts 500+400+100+500. How would I change it so that when there's a Sell/Transfer Out or Buy/Transfer In and there's no match inside the ArrayList, it tries to combine multiple items from that ArrayList and find the elements that add up to the aggregate?

Inside my code I tried to resolve that problem by simply summing everything when there's no match (index == -1):

                    int index = alNewHowMuch.IndexOf(number);
                    if (index != -1) {
                        alNewHowMuch.RemoveAt(index);
                        alNewPrice.RemoveAt(index);
                        alNewInID.RemoveAt(index);
                    } else {
                        ArrayList alTemp = new ArrayList();
                        decimal sum = 0;
                        for (int j = 0; j < alNewHowMuch.Count; j ++) {
                            string tempid = (string) alNewInID[j];
                            decimal tempPrice = (decimal) alNewPrice[j];
                            decimal tempNumbers = (decimal) alNewHowMuch[j];
                            if (id == tempid && tempPrice == price) {
                                alTemp.Add(j);
                                sum = sum + tempNumbers;
                            }
                        }
                        if (sum == number) {
                            for (int j = alTemp.Count - 1; j >= 0; j --) {
                                int tempIndex = (int) alTemp[j];
                                alNewHowMuch.RemoveAt(tempIndex);
                                alNewPrice.RemoveAt(tempIndex);
                                alNewInID.RemoveAt(tempIndex);
                            }
                        }
                    }

But it only works if certain conditions are met, and fails for the rest.

Since some of you were so astonished (and blinded) by my Polish variable names, I translated all of them to English for simplicity and visibility. Hopefully this will help me get some help :-)

Recommended answer

How you should do this depends on a number of important things: how many numbers will you have, and how big will they be? Also, as far as I understand, your data can change (numbers get added and removed), right? How often do you need to make these queries?

I'll present two solutions. I suggest you use the second, as I suspect it's better for what you need and it's a lot easier to understand.

Solution 1 - Dynamic programming

S[i] = true if we can make the sum i, and false otherwise.

S[0] = true // we can always make sum 0: just don't choose any number
S[i] = false for all i != 0
for each number i in your input
    for s = MaxSum downto i
        if ( S[s - i] == true )
            S[s] = true; // if we can make the sum s - i, we can also make the sum s by adding i to the sum s - i.

To get the actual numbers that make up your sum, keep another vector P[i] = the last number that was used to make sum i. You would update it accordingly in the if condition above.

The time complexity of this is O(numberOfNumbers * maxSumOfAllNumbers), which is pretty bad, especially since you have to rerun the algorithm whenever your data changes. Even a single run is slow when the numbers can be very big and there can be a lot of them. In fact, "a lot" is misleading: with just 100 numbers, each as big as 10 000, the sum of all numbers can reach 100 * 10 000 = 1 000 000, so in the worst case you would do roughly 100 * 1 000 000 = 100 000 000 operations each time your data changes.

It's a good solution to know, but not really useful in practice, or at least not in your case, I think.

Here's some C# for this approach:

    using System;
    using System.Collections.Generic;

    class Program
    {
        static void Main(string[] args)
        {
            List<int> testList = new List<int>();

            for (int i = 0; i < 1000; ++i)
            {
                testList.Add(1);
            }

            Console.WriteLine(SubsetSum.Find(testList, 1000));

            foreach (int index in SubsetSum.GetLastResult(1000))
            {
                Console.WriteLine(index);
            }
        }
    }

    static class SubsetSum
    {
        private static Dictionary<int, bool> memo;
        private static Dictionary<int, KeyValuePair<int, int>> prev;

        static SubsetSum()
        {
            memo = new Dictionary<int, bool>();
            prev = new Dictionary<int, KeyValuePair<int, int>>();
        }

        public static bool Find(List<int> inputArray, int sum)
        {
            memo.Clear();
            prev.Clear();

            memo[0] = true;
            prev[0] = new KeyValuePair<int,int>(-1, 0);

            for (int i = 0; i < inputArray.Count; ++i)
            {
                int num = inputArray[i];
                for (int s = sum; s >= num; --s)
                {
                    if (memo.ContainsKey(s - num) && memo[s - num] == true)
                    {
                        memo[s] = true;

                        if (!prev.ContainsKey(s))
                        {
                            prev[s] = new KeyValuePair<int,int>(i, num);
                        }
                    }
                }
            }

            return memo.ContainsKey(sum) && memo[sum];
        }

        public static IEnumerable<int> GetLastResult(int sum)
        {
            while (prev[sum].Key != -1)
            {
                yield return prev[sum].Key;
                sum -= prev[sum].Value;
            }
        }
    }

You should perhaps do some error checking, and maybe store the last sum in the class so that GetLastResult cannot be called with a different sum than the one Find was last called with. Anyway, this is the idea.
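
One way to act on that advice, as a rough sketch only: run the same dynamic programming pass but return the chosen indices together with the yes/no answer, so there is no separate GetLastResult call to misuse. The names SafeSubsetSum and TryFind are hypothetical (they do not appear in the code above), and the sketch assumes non-negative integer inputs.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the answer's code: same DP idea, but the
// result is handed back with the answer instead of being cached in the class.
static class SafeSubsetSum
{
    // Returns true and fills 'indices' with positions in 'input' whose values
    // add up to 'target'; returns false and an empty list otherwise.
    public static bool TryFind(IList<int> input, int target, out List<int> indices)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));
        indices = new List<int>();
        if (target < 0) return false;

        // reachable[s] is true if some subset of the items seen so far sums to s.
        var reachable = new bool[target + 1];
        // lastIndex[s] is the index of the item that first made sum s reachable.
        var lastIndex = new int[target + 1];
        reachable[0] = true;

        for (int i = 0; i < input.Count; ++i)
        {
            int num = input[i];
            if (num < 0) throw new ArgumentException("Negative values are not supported.");
            if (num == 0) continue; // a zero never changes a sum

            // Walk downwards so that each item is used at most once.
            for (int s = target; s >= num; --s)
            {
                if (reachable[s - num] && !reachable[s])
                {
                    reachable[s] = true;
                    lastIndex[s] = i;
                }
            }
        }

        if (!reachable[target]) return false;

        // Walk back from the target, peeling off one item per step.
        for (int s = target; s > 0; s -= input[lastIndex[s]])
        {
            indices.Add(lastIndex[s]);
        }
        return true;
    }
}
```

For the example from the question, calling SafeSubsetSum.TryFind(new List<int> { 500, 400, 100, 500 }, 1500, out var indices) should report success and return all four indices. Note that the two arrays have target + 1 entries each, so this sketch is only reasonable for moderately sized targets; decimal quantities would first have to be scaled to integers, which is an assumption about the data rather than something the original answer states.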

Solution 2 - Randomized algorithm

Now, this is easier. Keep two lists: usedNums and unusedNums. Also keep a variable usedSum that, at any point in time, contains the sum of all the numbers in the usedNums list.

Whenever you need to insert a number into your set, also add it to one of the two lists (it doesn't matter which, but choose randomly so that the distribution is relatively even). Update usedSum accordingly.

Whenever you need to remove a number from your set, find out which of the two lists it's in. A linear search is fine as long as you don't have a lot of numbers (this time "a lot" means over 10 000, maybe even 100 000 on a fast computer, assuming you don't do this operation often and in fast succession). The linear search can be optimized if you need it to be. Once you have found the number, remove it from the list and update usedSum accordingly.

Whenever you need to find out whether there are numbers in your set that sum to a number S, use this algorithm:

while S != usedSum
    if S > usedSum // our current usedSum is too small
        move a random number from unusedNums to usedNums and update usedSum
    else // our current usedSum is too big
        move a random number from usedNums to unusedNums and update usedSum

At the end of the algorithm, the list usedNums will give you the numbers whose sum is S.
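
The bookkeeping and the search loop described above might be sketched in C# roughly as follows. This is an illustration under stated assumptions, not a definitive implementation: the class name RandomSubsetSum, its method names and the maxSteps parameter are all hypothetical. The step cap is an addition of this sketch, because the loop as written never ends when no subset sums to S. The sketch also assumes all numbers are positive.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of the two-list idea; all names are illustrative only.
class RandomSubsetSum
{
    private readonly List<decimal> usedNums = new List<decimal>();
    private readonly List<decimal> unusedNums = new List<decimal>();
    private readonly Random rng = new Random();
    private decimal usedSum = 0m; // always the sum of everything in usedNums

    // Insert: put the number into one of the two lists, chosen at random.
    public void Add(decimal value)
    {
        if (rng.Next(2) == 0) { usedNums.Add(value); usedSum += value; }
        else { unusedNums.Add(value); }
    }

    // Remove: linear search in both lists; returns false if the value is absent.
    public bool Remove(decimal value)
    {
        if (usedNums.Remove(value)) { usedSum -= value; return true; }
        return unusedNums.Remove(value);
    }

    // Search: random walk until usedSum equals target or maxSteps is used up.
    // On success 'result' holds a copy of usedNums; on failure it is empty.
    public bool TryFind(decimal target, int maxSteps, out List<decimal> result)
    {
        result = new List<decimal>();
        for (int step = 0; usedSum != target && step < maxSteps; ++step)
        {
            if (target > usedSum)
            {
                // usedSum is too small: pull a random number in.
                if (unusedNums.Count == 0) break; // all numbers together are too small
                Move(unusedNums, usedNums, +1);
            }
            else
            {
                // usedSum is too big: push a random number out.
                if (usedNums.Count == 0) break; // only possible for a negative target
                Move(usedNums, unusedNums, -1);
            }
        }
        if (usedSum != target) return false;
        result.AddRange(usedNums);
        return true;
    }

    // Moves one randomly chosen element between the lists and adjusts usedSum.
    private void Move(List<decimal> from, List<decimal> to, int sign)
    {
        int k = rng.Next(from.Count);
        decimal value = from[k];
        from[k] = from[from.Count - 1]; // swap with the last element, then drop it
        from.RemoveAt(from.Count - 1);
        to.Add(value);
        usedSum += sign * value;
    }
}
```

A false result from TryFind only means that no match was found within maxSteps, not that none exists, so the cap is a trade-off between running time and the chance of missing a valid combination. In the question's setting one could imagine keeping one such instance per ID and price, but that is a guess about the data, not part of the original answer.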

I think this algorithm should be good for what you need. It handles changes to the dataset very well and works well when there are many numbers. It also doesn't depend on how big the numbers are, which is very useful if you have big numbers.

Leave a comment if you have any questions.
