勘误表后缀阵列的原始论文? [英] Errata in the original paper on suffix arrays?

查看:145
本文介绍了勘误表后缀阵列的原始论文?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我期待在原皮引入后缀数组的后缀数组:上线字符串搜索的新方法

I'm looking at the pseudo-code given in figure 3 of the original paper introducing suffix arrays "SUFFIX ARRAYS: A NEW METHOD FOR ON-LINE STRING SEARCHES".

我想不通的行4和第5(索引从0)的逻辑。该行写着:

I can't figure out the logic for lines 4 and 5 (indexing from 0). The lines reads:

否则如果为r P 是W <子>研究≤一个<子> POS [N-1] + R
  →是W ←ñ

else if r < P or wr ≤ aPos[N-1]+r then
LW ← N

是W 是一个长度为P我们正在寻找的模式,研究 LCP(A [POS [N-1]:],W)。的问题是,在几乎所有的情况下,这 LCP 将小于W的长度。这个条件是为了处理这种情况(我想),该模式是字典顺序比阵列中的字典顺序最大后缀较大,但它不检查这在所有。线路2及3,在另一方面,它测试,如果是W 小于字典序最小的后缀似乎完全合理

W is the pattern of length 'P' we're looking for and r is lcp(A[pos[N-1]:], W). The problem is that in almost all cases, this lcp will be less than length of 'W'. This conditional is meant to handle the case (I think) that the pattern is lexicographically larger than the lexicographically greatest suffix in the array, but it doesn't test this at all. Lines 2 & 3, on the other hand, which test if W is less than the lexicographically smallest suffix seem to make perfect sense

如果 L = P 是W <子>→≤一个<子> POS [0] + 1 的然后
  →是W ←0

if l = P or wl ≤ aPos[0]+l then
LW ← 0

我认为,原来的线路应为这样的:

I believe that the original lines should read something like:

否则如果为r P 是W <子>研究>一<子> POS [N-1] + R
  →是W ←ñ

else if r < P and wr > aPos[N-1]+r then
LW ← N

的唯一方法是是W 可大于 A [POS [N-1]:] 是,如果它有一个 LCP 不是该模式的长度(短,否则,所有的是W 比赛等是W 不能大,只有较小的或等于与我们分享的事 LCP ),如果之后的字符 LCP 是更大是W A [POS [N-1] 。这个问题似乎有意义吗?这是在原纸上的错误?如果没有,可有人请向我解释我是如何misinter preting原来的code?

The only way that W can be greater than A[pos[N-1]:] is if it has an lcp shorter than the length of the pattern (otherwise, all of W matches and so W cannot be greater, only lesser or equal to the thing with which we're sharing the lcp) AND if the character after the lcp is greater in W than in A[pos[N-1]]. Does this seem to make sense? Is this an error in the original paper? If not, can someone please explain to me how I'm misinterpreting the original code?

推荐答案

我想你明白纸正确的,实际上它有一个错误。

I think you understand the paper correct and in fact it has an error.

请看下面的例子:让 A =香蕉 W =娜娜。然后 A [POS [N-1]:] =娜娜。算法集 LW = N 甚至是失败,而实际上它的 N-1

Consider the following example: let A = banana, W = nana. Then A[pos[N-1]:] = nana. Algorithm sets LW = N or even fails while in fact it's N-1.

这篇关于勘误表后缀阵列的原始论文?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆