寻找款项阵的快速算法 [英] fast algorithm of finding sums in array

查看:102
本文介绍了寻找款项阵的快速算法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我要寻找一个快速算法:

I am looking for a fast algorithm:

我有大小为n的int数组,目标是找到阵列中的所有模式,
X1,X2,X3是在阵列中的不同元素,使得X1 + X2 = X3的

I have a int array of size n, the goal is to find all patterns in the array that
x1, x2, x3 are different elements in the array, such that x1+x2 = x3

例如我知道有大小3的int数组是 [1,2,3] 那么就只有一个可能性:1 + 2 = 3(考虑1+ 2 = 2 + 1)的

For example I know there's a int array of size 3 is [1, 2, 3] then there's only one possibility: 1+2 = 3 (consider 1+2 = 2+1)

我想实现对和包含HashMap,使算法快。 (最快的一个我现在得到的是仍为O(n ^ 2))

I am thinking about implementing Pairs and Hashmaps to make the algorithm fast. (the fastest one I got now is still O(n^2))

请分享你的想法对于这个问题,谢谢

Please share your idea for this problem, thank you

推荐答案

修改:下面的答案适用于一个版本的问题,你只需要一个三线,增加了这样的。当您希望所有的人,因为有可能至少为O(n ^ 2)可能的输出(如指出ex0du5),甚至为O(n ^ 3)重复元素病理情况下,你不会击败了简单的为O(n ^ 2)基于散列(映射从价值与价值指数的列表)算法。

Edit: The answer below applies to a version of this problem in which you only want one triplet that adds up like that. When you want all of them, since there are potentially at least O(n^2) possible outputs (as pointed out by ex0du5), and even O(n^3) in pathological cases of repeated elements, you're not going to beat the simple O(n^2) algorithm based on hashing (mapping from a value to the list of indices with that value).

这是基本的3SUM问题。如果没有潜在的无粘结大元素,最有名的算法是约为O(n ^ 2),但我们只证明,它不能快于为O(n LG N)进行计算的大多数型号。

This is basically the 3SUM problem. Without potentially unboundedly large elements, the best known algorithms are approximately O(n^2), but we've only proved that it can't be faster than O(n lg n) for most models of computation.

如果整数元素在范围 [U,V] ,你可以做一个稍微不同的版本,这在 O(N + (VU)LG(VU)) FFT 。我将描述一个过程来改变这个问题转化为一个,解决它在那里,然后找出在此基础上改造的回答你的问题。

If the integer elements lie in the range [u, v], you can do a slightly different version of this in O(n + (v-u) lg (v-u)) with an FFT. I'm going to describe a process to transform this problem into that one, solve it there, and then figure out the answer to your problem based on this transformation.

我知道如何解决与FFT的问题是找到一个数组长度为3的等差数列:即一个序列 A B C Ç - B = B - A ,或者等价地, A + C = 2B

The problem that I know how to solve with FFT is to find a length-3 arithmetic sequence in an array: that is, a sequence a, b, c with c - b = b - a, or equivalently, a + c = 2b.

不幸的是,转型的最后一步是不一样快,我想,但我会讲,当我们到达那里。

Unfortunately, the last step of the transformation back isn't as fast as I'd like, but I'll talk about that when we get there.

让我们把原来的数组 X ,其中包含整数 X_1,...,x_n 。我们希望找到指数Ĵ K 这样的该 x_i + x_j = x_k

Let's call your original array X, which contains integers x_1, ..., x_n. We want to find indices i, j, k such that x_i + x_j = x_k.

  1. 查找最低 U 和最大 v 的X O(N)的时间。让 U分(U,U * 2) V' MAX(V,V * 2)

  1. Find the minimum u and maximum v of X in O(n) time. Let u' be min(u, u*2) and v' be max(v, v*2).

构造一个二进制数组(比特串)以Z 长度 V' - U'+ 1 ; Z [I] 将是真实的,如果任一 X 或它的双 [X_1 * 2, ...,x_n * 2] 包含 U'+ I 。这是 O(N)来初始化;刚走到 X的每个元素,并设置以Z 的两个对应的元素。

Construct a binary array (bitstring) Z of length v' - u' + 1; Z[i] will be true if either X or its double [x_1*2, ..., x_n*2] contains u' + i. This is O(n) to initialize; just walk over each element of X and set the two corresponding elements of Z.

由于我们正在构建这个数组,我们可以节省我们发现任何重复的索引到一个辅助列表。一旦以Z 完成后,我们只是检查 2 * x_i 每个 x_i 。如果有任何present,我们就大功告成了;否则,重复是不相关的,我们可以忘掉。 (稍微复杂一点的唯一情况是,如果 0 重复;然后,我们需要的是三个不同的副本得到解决)

As we're building this array, we can save the indices of any duplicates we find into an auxiliary list Y. Once Z is complete, we just check for 2 * x_i for each x_i in Y. If any are present, we're done; otherwise the duplicates are irrelevant, and we can forget about Y. (The only situation slightly more complicated is if 0 is repeated; then we need three distinct copies of it to get a solution.)

现在,解决你的问题,即 x_i + x_j = x_k ,将出现在以Z 三均匀间隔的,因为一些简单的代数运算给我们 2 * x_j - x_k = x_k - 2 * x_i 。请注意,在末端的元素是我们特别增加了一倍项(从 2 )和一个在中间是一个普通的项目(从 X )。

Now, a solution to your problem, i.e. x_i + x_j = x_k, will appear in Z as three evenly-spaced ones, since some simple algebraic manipulations give us 2*x_j - x_k = x_k - 2*x_i. Note that the elements on the ends are our special doubled entries (from 2X) and the one in the middle is a regular entry (from X).

考虑以Z 作为重presentation多项式 P ,其中系数度的学期我 Z [I] 。如果 X [1,2,3,5] ,然后以Z 1111110001 (因为我们1,2,3,4,5,6,和10); P 是那么的 1 + X + X 2 + X 3 + X 4 + X 5 + X 9 的。

Consider Z as a representation of a polynomial p, where the coefficient for the term of degree i is Z[i]. If X is [1, 2, 3, 5], then Z is 1111110001 (because we have 1, 2, 3, 4, 5, 6, and 10); p is then 1 + x + x2 + x3 + x4 + x5 + x9.

现在,从高中代数记得系数的 X C 的两种多项式的产品的总和在所有的 A,B X A 的时间第二次的系数的 X B <中>与 A + B = C 的/ SUP> 的。所以,如果我们考虑的 Q = P 2 的,的的 X 2J 的系数(为的Ĵ 的与 Z [J] = 1 )将是一笔对所有的的Z [I] * Z [2 *的J - I] 。但是,由于以Z 是二进制的,那是三胞胎数完全的我在,J,K 的它们均匀间隔的人ž。注意的(j,J,J)的始终是这样一个三线,所以我们只关心那些与值> 1

Now, remember from high school algebra that the coefficient of xc in the product of two polynomials is the sum over all a, b with a + b = c of the first polynomial's coefficient for xa times the second's coefficient for xb. So, if we consider q = p2, the coefficient of x2j (for a j with Z[j] = 1) will be the sum over all i of Z[i] * Z[2*j - i]. But since Z is binary, that's exactly the number of triplets i,j,k which are evenly-spaced ones in Z. Note that (j, j, j) is always such a triplet, so we only care about ones with values > 1.

然后,我们可以使用快速傅立叶变换找到的 P 2 的在 O(| Z |登录| Z |)时间,其中<​​code> | Z | 是 V' - U'+ 1 。我们走出系数的另一个数组;称之为是W

We can then use a Fast Fourier Transform to find p2 in O(|Z| log |Z|) time, where |Z| is v' - u' + 1. We get out another array of coefficients; call it W.

遍历每个 x_k X 。 (回想一下,我们所期望的均匀间隔的人都集中 X 的元素,而不是 2 * X )。如果相应的是W 两次这个元素,即 W [2 *(x_k - U'),是1我们知道这不是任何非平凡级数的中心,我们可以跳过它。 (如前指出的,它应该仅是一个正整数。)

Loop over each x_k in X. (Recall that our desired evenly-spaced ones are all centered on an element of X, not 2*X.) If the corresponding W for twice this element, i.e. W[2*(x_k - u')], is 1, we know it's not the center of any nontrivial progressions and we can skip it. (As argued before, it should only be a positive integer.)

否则,它可能是我们希望有一个发展的中心,(因此我们需要找到Ĵ)。但是,不幸的是,它也可能是不具有我们所期望的形式的进展的中心。因此,我们需要检查。遍历其它元素 x_i X ,并检查是否有一个三用 2 * x_i x_k 2 * x_j 一些Ĵ(通过检查 Z [2 *(x_k - x_j) - U'] )。如果是这样,我们有一个答案;如果我们让它在所有 X 的无一命中,那么FFT只发现了虚假的答案,我们必须检查另一个元素是W

Otherwise, it might be the center of a progression that we want (so we need to find i and j). But, unfortunately, it might also be the center of a progression that doesn't have our desired form. So we need to check. Loop over the other elements x_i of X, and check if there's a triple with 2*x_i, x_k, 2*x_j for some j (by checking Z[2*(x_k - x_j) - u']). If so, we have an answer; if we make it through all of X without a hit, then the FFT found only spurious answers, and we have to check another element of W.

因此​​,此最后的步骤是O(n * 1 +(x_k为W的数量[2 *(x_k - U')]> 1是不实际的解决方案)),其是也许可能为O(n ^ 2),这显然是不行的。应该有一种方法来避免输出是W 生成这些虚假的答案;如果我们知道任何适当的是W 系数肯定有一个答案,这最后一步将是 O(N)和所有都会很好。

This last step is therefore O(n * 1 + (number of x_k with W[2*(x_k - u')] > 1 that aren't actually solutions)), which is maybe possibly O(n^2), which is obviously not okay. There should be a way to avoid generating these spurious answers in the output W; if we knew that any appropriate W coefficient definitely had an answer, this last step would be O(n) and all would be well.

我认为这是可以使用稍微不同的多项式要做到这一点,但我还没有得到它的实际工作。我会考虑的更多一些......

I think it's possible to use a somewhat different polynomial to do this, but I haven't gotten it to actually work. I'll think about it some more....

部分基于这个答案

这篇关于寻找款项阵的快速算法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆