O(nlogn)算法 - 查找二进制字符串中的三个均匀间隔的人 [英] O(nlogn) Algorithm - Find three evenly spaced ones within binary string

查看:200
本文介绍了O(nlogn)算法 - 查找二进制字符串中的三个均匀间隔的人的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我对一个算法的测试这个问题,昨天,我不能找出答案。这是推动我绝对是疯了,因为它是价值约40点。我想,大多数类并没有解决它正确,因为我还没有想出了在过去24小时内解决。

I had this question on an Algorithms test yesterday, and I can't figure out the answer. It is driving me absolutely crazy, because it was worth about 40 points. I figure that most of the class didn't solve it correctly, because I haven't come up with a solution in the past 24 hours.

鉴于长度为n的任意的二进制串,查找字符串在三个均匀间隔的,如果它们存在。写一个算法,解决了这个为O(N *的log(n))的时间。

Given a arbitrary binary string of length n, find three evenly spaced ones within the string if they exist. Write an algorithm which solves this in O(n * log(n)) time.

所以像这些字符串有三种那些被均匀间隔:11100000,0100100100

So strings like these have three ones that are "evenly spaced": 11100000, 0100100100

编辑:这是随机数,所以它应能适用于任何数目。我给的例子是为了说明均匀分布的属性。所以,1001011是有效的数字。带1,4和7被那些被均匀地间隔开。

edit: It is a random number, so it should be able to work for any number. The examples I gave were to illustrate the "evenly spaced" property. So 1001011 is a valid number. With 1, 4, and 7 being ones that are evenly spaced.

推荐答案

终于来了!继在<一个线索href="http://stackoverflow.com/questions/1560523/onlogn-algorithm-find-three-evenly-spaced-ones-within-binary-string/1579165#1579165">sdcvvc's回答,我们有它:在为O(n log n)的算法问题!这也很简单,你了解它之后。这些谁猜FFT是正确的。

Finally! Following up leads in sdcvvc's answer, we have it: the O(n log n) algorithm for the problem! It is simple too, after you understand it. Those who guessed FFT were right.

问题:我们有一个二进制字符串长度取值 N 的,我们要找到三个均匀间隔1秒在里面。例如,取值可能 110110010 ,其中的 N = 9。它已均匀在位置2,5,和8隔开1秒

The problem: we are given a binary string S of length n, and we want to find three evenly spaced 1s in it. For example, S may be 110110010, where n=9. It has evenly spaced 1s at positions 2, 5, and 8.

  1. 扫描取值左到右,并制作一个列表 1为阵地 S = 110110010 上面,我们有列表L = [1,2,4,5,8]。这一步是O(n)。现在的问题是找到长度的算术级数3 ,即找到不同的 A,B,C 这样的 BA = CB 的,或者等价的 A + C = 2B 。对于上面的例子中,我们希望找到的进展(2,5,8)。

  1. Scan S left to right, and make a list L of positions of 1. For the S=110110010 above, we have the list L = [1, 2, 4, 5, 8]. This step is O(n). The problem is now to find an arithmetic progression of length 3 in L, i.e. to find distinct a, b, c in L such that b-a = c-b, or equivalently a+c=2b. For the example above, we want to find the progression (2, 5, 8).

请在多项式 P 与条件的 X K 的每个 K 的在。对于上面的例子中,我们使多项式的 P(x)=(X + X 2 + X 4 + X 5 + x 8 的。这一步是O(n)。

Make a polynomial p with terms xk for each k in L. For the example above, we make the polynomial p(x) = (x + x2 + x4 + x5+x8). This step is O(n).

查找多项式 = P 2 的,使用的Fast傅立叶变换。对于上面的例子中,我们得到的多项式的 Q(X)= X 16 + 2× 13 + 2× 12 + 3X 10 + 4X 9 + X 8 + 2× 7 + 4X 6 + 2X 5 + X 4 + 2× 3 + X 2 的。 这一步是O(n log n)的。

Find the polynomial q = p2, using the Fast Fourier Transform. For the example above, we get the polynomial q(x) = x16 + 2x13 + 2x12 + 3x10 + 4x9 + x8 + 2x7 + 4x6 + 2x5 + x4 + 2x3 + x2. This step is O(n log n).

忽略,除​​了那些对应的 X 2K 的一段的 K 的在。对于以上的例子中,我们得到的条款的 X 16 ,3倍 10 中,x 8 中,x 4 中,x 2 的。这一步是O(n),如果你选择这样做的。

Ignore all terms except those corresponding to x2k for some k in L. For the example above, we get the terms x16, 3x10, x8, x4, x2. This step is O(n), if you choose to do it at all.

这里的关键之处:所有的 X 2B 的的的 B 的在 precisely 的对数的(A,C)的在这样的 A + C = 2B 的。 [CLRS,防爆。 30.1-7]一种这样的对是(B,B)的总(因此系数至少为1),但如果存在任何其它成对的(A,C)的,则系数至少为3,从(A,C)的和的(C,A)的。对于以上的例子中,我们有系数的 X 10 的是3 $ P $因为在AP(2,5,8)的pcisely。 (这些系数的 X 2b的功能 的永远是奇数,上述原因,而且在q中的所有其他系数将始终是偶数。)

Here's the crucial point: the coefficient of any x2b for b in L is precisely the number of pairs (a,c) in L such that a+c=2b. [CLRS, Ex. 30.1-7] One such pair is (b,b) always (so the coefficient is at least 1), but if there exists any other pair (a,c), then the coefficient is at least 3, from (a,c) and (c,a). For the example above, we have the coefficient of x10 to be 3 precisely because of the AP (2,5,8). (These coefficients x2b will always be odd numbers, for the reasons above. And all other coefficients in q will always be even.)

那么,该算法是看这些项的系数的 X 2B 的,看看其中是否大于1。如果没有,然后没有均匀间隔1秒。如果没有的<​​em>是的一个的 B 的在为其中的系数的 X 2B 的大于1,那么我们知道有一些对的(A,C)的 - 比其他的(B,B)的 - 对此的 A + C = 2B 的。要找到实际的对,我们简单地尝试每个的的在(相应的 C 的是 2B -a 的),看看是否有一个1位的 2B-A 的在取值。这一步是O(n)。

So then, the algorithm is to look at the coefficients of these terms x2b, and see if any of them is greater than 1. If there is none, then there are no evenly spaced 1s. If there is a b in L for which the coefficient of x2b is greater than 1, then we know that there is some pair (a,c) — other than (b,b) — for which a+c=2b. To find the actual pair, we simply try each a in L (the corresponding c would be 2b-a) and see if there is a 1 at position 2b-a in S. This step is O(n).

这一切,乡亲们。


也许有人会问:我们需要使用FFT?很多答案,如<一个href="http://stackoverflow.com/questions/1560523/onlogn-algorithm-find-three-evenly-spaced-ones-within-binary-string/1561827#1561827">beta's, <一href="http://stackoverflow.com/questions/1560523/onlogn-algorithm-find-three-evenly-spaced-ones-within-binary-string/1572080#1572080">flybywire's,和<一href="http://stackoverflow.com/questions/1560523/onlogn-algorithm-find-three-evenly-spaced-ones-within-binary-string/1567324#1567324">rsp's,建议,检查每对1秒,并认为如果有一个1在第三的位置,这种方法可能会为O工作(N log n)的基础上,直觉是,如果有太多的1S,我们会找到一个三重容易,如果有过少1秒,检查所有对需要一点时间。不幸的是,尽管这种直觉是正确的,简单的方法的的为O更好的(N 2 ),它不是显著更好。作为<一个href="http://stackoverflow.com/questions/1560523/onlogn-algorithm-find-three-evenly-spaced-ones-within-binary-string/1579165#1579165">sdcvvc's回答,我们可以采取康托般的一套长串的 N = 3 K 的,用ls在其三元重新$ P $的位置psentation只有在它0和2秒(无1秒)。这样的字符串有 2 K = N (日志2)/(日志3)≈ñ 0.63 的那些在它并没有均匀间隔1秒,所以检查所有对将是在它的1的数目的平方的数量级:这是 4 K ≈Ñ 1.26 的不幸是渐近远远大于(N log n)的。其实,最坏的情况甚至更糟:狮子座莫泽在1953年构建(有效的)这样的字符串其具有的 N 1-C /√(log n)的 的1秒在他们,但不均匀间隔1秒,这意味着,在这样的字符串,则简单的方法将采取的 Θ(N 2-2C /√(log n)的的 - 只有一个的微小的比好一点的Θ( ñ 2 的,令人惊讶的!

One might ask: do we need to use FFT? Many answers, such as beta's, flybywire's, and rsp's, suggest that the approach that checks each pair of 1s and sees if there is a 1 at the "third" position, might work in O(n log n), based on the intuition that if there are too many 1s, we would find a triple easily, and if there are too few 1s, checking all pairs takes little time. Unfortunately, while this intuition is correct and the simple approach is better than O(n2), it is not significantly better. As in sdcvvc's answer, we can take the "Cantor-like set" of strings of length n=3k, with 1s at the positions whose ternary representation has only 0s and 2s (no 1s) in it. Such a string has 2k = n(log 2)/(log 3) ≈ n0.63 ones in it and no evenly spaced 1s, so checking all pairs would be of the order of the square of the number of 1s in it: that's 4k ≈ n1.26 which unfortunately is asymptotically much larger than (n log n). In fact, the worst case is even worse: Leo Moser in 1953 constructed (effectively) such strings which have n1-c/√(log n) 1s in them but no evenly spaced 1s, which means that on such strings, the simple approach would take Θ(n2-2c/√(log n)) — only a tiny bit better than Θ(n2), surprisingly!


关于1秒的长度为n,没有3间隔均匀的那些串的最大数量(我们上面看到的是至少的 N 0.63 的从简单的Cantor-像建筑,以及至少 N 1-C /√(log n)的 的与莫泽的建设) - 这是的OEIS A003002 。它也可直接从 OEIS A065825 的计算为第k使得A065825(k)的≤N'其中; A065825第(k + 1)。我写了一个程序来找到这些,而且事实证明,贪心算法做的没有的给予最长的字符串。例如,对于的 N 的= 9,我们可以得到5 1S(110100011),但贪婪只给出4(110110000),为的 N = 26,我们可以得到11 1S( 11001010001000010110001101),但贪婪仅给出8(11011000011011000000000000),和用于 N 的= 74,我们可以得到22 1秒(11000010110001000001011010001000000000000000010001011010000010001101000011),但贪婪仅给出16(11011000011011000000000000011011000011011000000000000000000000000000000000)。他们不同意,在不少地方,直到50(例如,所有的38至50),虽然。由于OEIS引用说,似乎雅罗斯瓦夫·莱夫斯基很关心这个问题,他认为这些网站的非平均集的。具体的数字是已知的只到194。

About the maximum number of 1s in a string of length n with no 3 evenly spaced ones (which we saw above was at least n0.63 from the easy Cantor-like construction, and at least n1-c/√(log n) with Moser's construction) — this is OEIS A003002. It can also be calculated directly from OEIS A065825 as the k such that A065825(k) ≤ n < A065825(k+1). I wrote a program to find these, and it turns out that the greedy algorithm does not give the longest such string. For example, for n=9, we can get 5 1s (110100011) but the greedy gives only 4 (110110000), for n=26 we can get 11 1s (11001010001000010110001101) but the greedy gives only 8 (11011000011011000000000000), and for n=74 we can get 22 1s (11000010110001000001011010001000000000000000010001011010000010001101000011) but the greedy gives only 16 (11011000011011000000000000011011000011011000000000000000000000000000000000). They do agree at quite a few places until 50 (e.g. all of 38 to 50), though. As the OEIS references say, it seems that Jaroslaw Wroblewski is interested in this question, and he maintains a website on these non-averaging sets. The exact numbers are known only up to 194.

这篇关于O(nlogn)算法 - 查找二进制字符串中的三个均匀间隔的人的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆