该算法对查找字符串的周期是否正确? [英] Is this algorithm correct for finding period of a string?

查看:80
本文介绍了该算法对查找字符串的周期是否正确?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

此问题的答案引用了Crochemore和Rytter的文本算法第340页中的线性时间算法计算字符串的周期。但是,它非常复杂,下面的代码是根据双向算法(由Chrochemore和Perrin编写)使用的最大后缀算法改编而成,对于计算周期似乎是正确的:

An answer to this question cites page 340 of "Text algorithms" by Crochemore and Rytter for a linear-time algorithm to compute the period of a string. However it's quite complex, and the following, adapted from the maximal suffix algorithm used in the Two Way algorithm (by Chrochemore and Perrin), seems correct for computing the period:

size_t period_of(const char *x)
{
    size_t j=1, k=0, p=1;
    while (x[j+k]) {
        if (x[j+k] != x[k]) {
            j += k?k:1;        // Previously: j += k+1;
            k = 0;
            p = j;
        } else if (k != p) {
            k++;
        } else {
            j += p;
            k = 0;
        }
    }
    return p;
}

采用双向的版本来计算周期最大后缀是计算最大后缀的副作用。但是,除非我丢失了某些东西,否则逻辑的有效性似乎并不取决于最大后缀属性。

Their version in Two Way, from which this is adapted, computes the period of the maximal suffix as a side effect of computing the maximal suffix. However, unless I'm missing something, the validity of the logic does not seem to depend on the maximal suffix property.

以上方法正确吗?如果不是,您是否可以提供一个显示失败之处的反例?

Is the above correct? If not, can you provide a counterexample that shows where it fails?

推荐答案

一个反例,即使解决方法是:

A counterexample even after the fix is:

aabaaaba

在位置3处,算法首先开始匹配期间3匹配的前缀。但是,当它在第5位获得 a 而不是 b 时,错误地将候选期跃升至5缺少实际的4周期。

At position 3, the algorithm first starts matching the prefix of a period-3 match. But when it gets an a instead of a b at position 5, it wrongly jumps the candidate period up to 5, missing the actual period of 4.

双向算法的改编自该算法,计算了最大后缀的周期作为寻找最大后缀的副作用后缀,确实依赖于最大后缀属性。代替!= 条件,它具有单独的> < 条件,其中两个之一将替换候选后缀的开头,另一个将延长运行时间。非严格地说,上述情况不会出现,因为 b> a ,在这种情况下,后缀将从 b a>开始。 b ,在这种情况下 aaa> aab 。我怀疑更详细地阅读该论文(这需要使用基于一个的索引来解密其可怕的伪代码表示法)才能使其余的内容变得清晰。

The algorithm in Two Way, from which this was adapted, that computes the period of the maximal suffix as a side effect of finding the maximal suffix, does rely on the maximal suffix property. Instead of the != condition, it has separate > and < conditions, where one of the two will replace the start of the candidate suffix and the other will extend the running period. Stated non-rigorously, the above kind of situation cannot arise, because either b > a, in which case the suffix would start at a b, or a > b, in which case aaa > aab. I suspect reading the paper in more detail (which requires deciphering its awful pseudocode notation with one-based indexing) will clarify the rest.

很遗憾,我相当相信我询问的算法不可恢复。

Unfortunately I'm fairly convinced that the algorithm I asked about is not recoverable.

另外请注意,在示例中, a 不必是单个字符。它可以是任意长的模式。这似乎排除了任何琐碎的线性时间修正。

Further note that in the example, a need not be a single character. It can be an arbitrarily-long pattern. This seems to preclude any trivial linear-time fixups.

这篇关于该算法对查找字符串的周期是否正确?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆