最长的二进制序列,没有相等的n长度子序列 [英] Longest binary sequence with no equal n-length subsequences

查看:121
本文介绍了最长的二进制序列,没有相等的n长度子序列的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我们正在寻找一种符合以下条件的算法。

We are looking for an algorithm with the following criteria.

输入是任意正整数( n ),表示比较子序列的长度。

Input is an arbitrary positive integer (n), that represents the length of the compare subsequences.

我们搜索最长的二进制序列,该序列不包含相等的n长度子序列。匹配的相等序列可以重叠(当匹配项必须不相交时,这也是一个有趣的问题)。输出将是此位序列。

We search the longest binary sequence, which contains no equal n-length subsequences. Matched equal sequences can be overlapped (also an interesting problem when matches must be disjoint). Output will be this sequence of bits.

例如,如果 n = 3

10111010 无效,因为重复的 101 子序列。 01010 也是无效的,因为多次出现 010 01101001 是有效的,但显然不是最长的序列。

10111010 is invalid because of the repeating 101 subsequences. 01010 is also invalid because of multiple occurrences of 010. 01101001 is valid, but evidently not the longest possible sequence.

推荐答案

谷歌搜索二进制De Bruijn序列算法,我发现了这一点,您实际上可以分辨出发生了什么。它被称为 FKM算法(在Fredricksen,Kessler和Maiorana之后),它使用项链前缀方法找到了字典上最小的De Bruijn序列。我将使用n = 4的示例进行说明。

Googling for binary De Bruijn sequence algorithms, I found this one where you can actually tell what's happening. Known as the "FKM algorithm" (after Fredricksen, Kessler and Maiorana), it finds the lexicographically least De Bruijn sequence using the "necklace prefix" method. I'll explain using the example with n=4.

首先,创建长度为n的所有二进制序列,即从0到2 n -1的所有数字:

First, create all binary sequences of length n, i.e. all numbers from 0 to 2n-1:

0000,0001,0010,0011,0100,0101,0110,0111,1000,1001,1010,
1011,1100,1101,1110,1111

0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111, 1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111

然后,删除不是旋转最低的序列,例如可以将 0110 旋转为较小的 0011

Then, remove the sequences which are not in their lowest rotation, e.g. 0110 can be rotated to 0011 which is smaller:


0000,0001,0011,0101,0111,1111

0000, 0001, 0011, 0101, 0111, 1111

(您会注意到这会删除所有除 0000 以外的偶数,以及所有大于 0111 的数,除了 1111 ,这有助于简化代码。)

(You'll notice that this removes a.o. all even numbers except 0000, and all numbers greater than 0111 except 1111, which helps to simplify code.)

然后将序列减少为非周期性前缀,即如果它们是重复较短的序列,请使用较短的序列;例如 0101 01 的重复, 1111 是重复的的 1

Then reduce the sequences to their "aperiodic prefix", i.e. if they are a repetition of a shorter sequence, use that shorter sequence; e.g. 0101 is a repetition of 01, 1111 is a repetition of 1:


0,0001,0011,01,0111,1

0, 0001, 0011, 01, 0111, 1

加入序列,您将获得De Bruijn序列:

Join the sequences, and you have a De Bruijn sequence:


0000100110101111

0000100110101111

对于非循环序列,请添加n-1个零:

For a non-circular sequence, add n-1 zeros:


0000100110101111000

0000100110101111000

(更多信息: F。Ruskey,J。Sawada,A。Williams:固定权重二进制的De Bruijn序列字符串 和B.史蒂文斯,A。威廉姆斯:二进制字符串的最酷顺序,摘自:算法的乐趣,2012年,第327-328页)

(further information: F. Ruskey, J. Sawada, A. Williams: "De Bruijn Sequences for Fixed-Weight Binary Strings" and B. Stevens, A. Williams: "The Coolest Order Of Binary Strings", from: "Fun With Algorithms", 2012, pp. 327-328)

我很想知道FKM与其他算法相比的性能如何,所以我写了这个相当笨拙的JavaScript实现。它确实快得多,并且在一秒钟之内生成了N = 20的1,048,595位数字序列。用严肃的语言,这应该很快。

I was curious to see how FKM performed compared to my other algorithm, so I wrote this rather clumsy JavaScript implementation. It is indeed much faster, and generates the 1,048,595 digit sequence for N=20 in under a second. In a serious language this should be very fast.

function DeBruijnFKM(n) {
    var seq = "0";                                         // start with 0 precalculated
    for (var i = 1; i < n; i++) {                      // i = number of significant bits
        var zeros = "", max = Math.pow(2, i);
        for (var j = n; j > i; j--) zeros += "0";                   // n-i leading zeros
        for (var k = i > 1 ? max / 2 + 1 : 1; k < max; k += 2) {     // odd numbers only
            var bin = k.toString(2);                           // bin = significant bits
            if (isSmallestRotation(zeros, bin)) {
                seq += aperiodicPrefix(zeros, bin);
            }
        }
    }
    return seq + Math.pow(2, n - 1).toString(2);      // append 2^N-1 and trailing zeros

    function isSmallestRotation(zeros, bin) {
        var len = 0, pos = 1;   // len = number of consecutive zeros in significant bits
        for (var i = 1; i < bin.length; i++) {
            if (bin.charAt(i) == "1") {
                if (len > zeros.length) return false;   // more zeros than leading zeros
                if (len == zeros.length
                && zeros + bin > bin.substr(pos) + zeros + bin.substr(0, pos)) {
                    return false;                              // smaller rotation found
                }
                len = 0;
                pos = i + 1;
            }
            else ++len;
        }
        return true;
    }

    function aperiodicPrefix(zeros, bin) {
        if (zeros.length >= bin.length) return zeros + bin;    // too many leading zeros
        bin = zeros + bin;
        for (var i = 2; i <= bin.length / 2; i++) {  // skip 1; not used for 0 and 2^N-1
            if (bin.length % i) continue;
            var pre = bin.substr(0, i);                      // pre = prefix of length i
            for (var j = i; j < bin.length; j += i) {
                if (pre != bin.substr(j, i)) break;              // non-equal part found
            }
            if (j == bin.length) return pre;                      // all parts are equal
        }
        return bin;                                               // no repetition found
    }
}

document.write(DeBruijnFKM(10));

这篇关于最长的二进制序列,没有相等的n长度子序列的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆