在排序后的数组中找到三个元素,这些元素的总和为第四个元素 [英] Find three elements in a sorted array which sum to a fourth element

查看:75
本文介绍了在排序后的数组中找到三个元素,这些元素的总和为第四个元素的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我的一个朋友最近遇到了这个面试问题,在我们看来这是可以解决的,但不在面试官认为应该可行的渐近时间范围内。问题出在这里:

A friend of mine recently got this interview question, which seems to us to be solvable but not within the asymptotic time bounds that the interviewer thought should be possible. Here is the problem:


您有一个 N 个整数数组, xs ,已排序但可能不明显。您的目标是找到四个数组索引(1) (a,b,c,d),使以下两个属性成立:

You have an array of N integers, xs, sorted but possibly non-distinct. Your goal is to find four array indices(1) (a,b,c,d) such that the following two properties hold:

xs[a] + xs[b] + xs[c] = xs[d]

a < b < c < d

目标是在O(N 2 )时间内完成此操作

The goal is to do this in O(N2) time.

首先,一个O(N 3 log(N))解决方案显而易见:对于每个(a,b,c)有序三元组,请使用二进制搜索查看是否可以找到合适的 d 。现在,如何做得更好?

First, an O(N3log(N)) solution is obvious: for each (a,b,c) ordered triple, use binary search to see if an appropriate d can be found. Now, how to do better?

面试官的一个有趣建议是将第一个条件重写为:

One interesting suggestion from the interviewer is to rewrite the first condition as:

xs[a] + xs[b] = xs[d] - xs[c]

目前尚不清楚该怎么做,但是也许我们可以选择一些枢轴值P,然后搜索(a,b)对加起来等于P,然后减去(d,c)对。通过从数组的两端向内搜索,对于给定的P来说,搜索很容易在O(n)时间内完成。但是,在我看来,问题在于存在N 2 个这样的值P,而不仅仅是N个,因此我们实际上并没有减小问题的大小:

It's not clear what to do after this, but perhaps we could chose some pivot value P, and search for an (a,b) pair adding up to P, and a (d,c) pair subtracting to it. That search is easy enough to do in O(n) time for a given P, by searching inwards from both ends of the array. However, it seems to me that the problem with this is that there are N2 such values P, not just N of them, so we haven't actually reduced the problem size at all: we're doing O(N) work, O(N2) times.

我们发现其他相关问题正在其他地方在线讨论:可以在N 2 时间内在数组中找到3个加到给定总和的数字是可解的,但要求总和预先固定;改编相同的算法,但是遍历每个可能的总和,使我们一如既往地处于N 3

We found some related problems being discussed online elsewhere: Find 3 numbers in an array adding to a given sum is solvable in N2 time, but requires that the sum be fixed ahead of time; adapting the same algorithm but iterating through each possible sum leaves us at N3 as always.

另一个相关的问题似乎是在数组中查找总和小于或等于给定总和的所有三元组,但我不确定此处有多少相关的内容:不平等而不是平等会把事情混在一起,而且目标当然是固定的,而不是变化的。

Another related problem seems to be Find all triplets in array with sum less than or equal to given sum, but I'm not sure how much of the stuff there is relevant here: an inequality rather than an equality mixes things up quite a bit, and of course the target is fixed rather than varying.

那么,我们缺少什么呢?考虑到性能要求,问题毕竟不可能吗?还是有一种我们无法发现的聪明算法?

So, what are we missing? Is the problem impossible after all, given the performance requirements? Or is there a clever algorithm we're unable to spot?

(1)实际上,问题在于找到所有这样的(a,b,c,d)元组,并返回计数。但是我认为,即使在所需的时间限制中找到其中一个也很难。

(1) Actually the problem as posed is to find all such (a,b,c,d) tuples, and return a count of how many there are. But I think even finding a single one of them in the required time constraints is hard enough.

推荐答案

如果算法必须 list 解决方案(即 a b c d 满足条件),最差的时间复杂度是 O(n 4

If the algorithm would have to list the solutions (i.e. the sets of a, b, c, and d that satisfy the condition), the worst case time complexity is O(n4):

平凡的例子是其中只有0个值的数组。然后 a b c d 拥有所有自由,只要它们保持秩序。这表示 O(n 4 个解决方案。

The trivial example is an array with only 0 values in it. Then a, b, c and d have all the freedom as long as they stay in order. This represents O(n4) solutions.

但是更普遍地,遵循以下模式的数组具有 O(n 4 解决方案:

But more generally arrays which follow the following pattern have O(n4) solutions:

w, w, w, ... x, x, x, ..., y, y, y, ...  z, z, z, ....

每次出现的次数相同,并且:

With just as many occurrences of each, and:

w + x + y = z

但是,仅产生解的 number 个算法,时间复杂度更好。

However, to only produce the number of solutions, an algorithm can have a better time complexity.

这是已发布算法的细微变化,不涉及 H 因素。它还描述了如何处理不同配置导致总和相同的情况。

This is a slight variation of the already posted algorithm, which does not involve the H factor. It also describes how to handle cases where different configurations lead to the same sums.


  • 检索所有对并将它们存储在数组中 X ,其中每个元素获取以下信息:

  • Retrieve all pairs and store them in an array X, where each element gets the following information:

a :两者中的最小索引

b :其他索引

总和 xs [a] + xs [b]的值

a: the smallest index of the two
b: the other index
sum: the value of xs[a] + xs[b]

同时还将每个这样的货币对存储在另一个数组 Y 中,以下内容:

At the same time also store for each such pair in another array Y, the following:

c :两个索引中的最小索引

d :另一个索引< br>
sum xs [d]-xs [c]

c: the smallest index of the two
d: the other index
sum: the value of xs[d] - xs[c]

上述操作的时间复杂度为 O(n²)

The above operation has a time complexity of O(n²)


  • 将两个数组按其元素的 sum 属性排序。如果 sum 值相等,则排序顺序将如下确定:对于 X 数组,通过增加 b 来确定;通过减小 c 来添加 Y 数组。可以在 O(n²) O(n²logn)时间内进行排序。

  • Sort both arrays by their element's sum attribute. In case of equal sum values, the sort order will be determined as follows: for the X array by increasing b; for the Y array by decreasing c. Sorting can be done in O(n²) O(n²logn) time.

[编辑:我无法证明先前的 O(n²)要求(除非做出一些假设以允许使用基数/存储桶排序算法,否则我会这样做)不承担)。如评论中所述,通常可以将具有元素的数组排序为 O(n²logn²),即 O(n²logn),但是不是 O(n²)]

[ I could not prove the earlier claim of O(n²) (unless some assumptions are made that allow for a radix/bucket sorting algorithm, which I will not assume). As noted in comments, in general an array with elements can be sorted in O(n²logn²), which is O(n²logn), but not O(n²)]


  • 以串联方式遍历两个数组以找到相等的总和。如果是这种情况,则需要检查 X [i] .b< Y [j] .c 。如果是这样,则代表解决方案。但是可能有很多,并且在可接受的时间内对它们进行计数需要特别注意。

  • Go through both arrays in "tandem" to find pairs of sums that are equal. If that is the case, it needs to be checked that X[i].b < Y[j].c. If so it represents a solution. But there could be many of them, and counting those in an acceptable time needs special care.

m = n(n-1)/ 2 ,即数组中的元素数X (也就是数组 Y 的大小):

Let m = n(n-1)/2, i.e. the number of elements in array X (which is also the size of array Y):

    i = 0
    j = 0
    while i < m and j < m:
        if X[i].sum < Y[j].sum:
            i = i + 1
        elif X[i].sum > Y[j].sum:
            j = j + 1
        else:
            # We have a solution. Need to count all others that have same sums in X and Y.
            # Find last match in Y and set k as index to it:
            countY = 0
            while k < m and X[i].sum == Y[j].sum and X[i].b < Y[j].c:
                countY = countY + 1
                j = j + 1
            k = j - 1
            # add chunks to `count`:
            while i < m and countY >= 0 and X[i].sum == Y[k].sum:
                while countY >= 0 and X[i].b >= Y[k].c:
                    countY = countY - 1
                    k = k - 1
                count = count + countY
                i = i + 1

请注意,尽管存在嵌套循环,变量 i 只会递增, j 也是如此。变量 k 总是在最里面的循环中递减。尽管从一开始它也会获得更高的值,但是通过 k 索引,它对同一个 Y 元素的寻址绝不能超过固定的次数,因为在减少此索引的同时,它保持在 Y 的相同金额范围内。

Note that although there are nested loops, the variable i only ever increments, and so does j. The variable k always decrements in the innermost loop. Although it also gets higher values to start from, it can never address the same Y element more than a constant number of times via the k index, because while decrementing this index, it stays within the "same sum" range of Y.

因此,这意味着算法的最后一部分运行在 O(m),即 O(n²)。我的最新编辑确认排序步骤不是 O(n²),该步骤确定了总体时间复杂度: O(n²logn)

So this means that this last part of the algorithm runs in O(m), which is O(n²). As my latest edit confirmed that the sorting step is not O(n²), that step determines the overall time-complexity: O(n²logn).

这篇关于在排序后的数组中找到三个元素,这些元素的总和为第四个元素的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆