"部分匹配"表(又名"故障功能和QUOT;)在KMP(维基百科) [英] "Partial match" table (aka "failure function") in KMP (on wikipedia)
Problem description
I'm reading about the KMP algorithm on Wikipedia. One line of code in the "Description of pseudocode for the table-building algorithm" section confuses me: let cnd ← T[cnd]
It has a comment: (second case: it doesn't, but we can fall back).
I know we can fall back, but why to T[cnd]? Is there a reason? It really confuses me.
Here is the complete pseudocode for the table-building algorithm:
algorithm kmp_table:
    input:
        an array of characters, W (the word to be analyzed)
        an array of integers, T (the table to be filled)
    output:
        nothing (but during operation, it populates the table)

    define variables:
        an integer, pos ← 2 (the current position we are computing in T)
        an integer, cnd ← 0 (the zero-based index in W of the next
            character of the current candidate substring)

    (the first few values are fixed but different from what the algorithm
    might suggest)
    let T[0] ← -1, T[1] ← 0

    while pos < length(W) do
        (first case: the substring continues)
        if W[pos - 1] = W[cnd] then
            let cnd ← cnd + 1, T[pos] ← cnd, pos ← pos + 1

        (second case: it doesn't, but we can fall back)
        else if cnd > 0 then
            let cnd ← T[cnd]

        (third case: we have run out of candidates. Note cnd = 0)
        else
            let T[pos] ← 0, pos ← pos + 1
You can fall back to T[cnd] because it holds the length of the longest proper prefix of the pattern W that is also a proper suffix of W[0...cnd-1], the candidate prefix you have matched so far. That makes it the next-longest candidate worth trying. So if the current character W[pos-1] matches the character at W[T[cnd]], you can extend that shorter candidate by one character, and its new length is the longest proper prefix of W that is also a suffix of W[0...pos-1] (which is the first case).
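To make the fallback a bit more concrete, here is a minimal Python sketch. The helper names (longest_border, border_chain, all_borders) are made up for illustration and do not come from the Wikipedia article. It checks by brute force that repeatedly taking "the longest proper prefix that is also a suffix" visits every such prefix length, from longest to shortest, which is the sequence of candidates that cnd ← T[cnd] steps through.

```python
def longest_border(s):
    """Length of the longest proper prefix of s that is also a suffix of s."""
    for k in range(len(s) - 1, 0, -1):
        if s[:k] == s[-k:]:
            return k
    return 0


def border_chain(s):
    """Prefix-suffix lengths found by falling back repeatedly, longest first."""
    chain = []
    k = longest_border(s)
    while k > 0:
        chain.append(k)
        # Same idea as cnd <- T[cnd]: look inside the shorter prefix.
        k = longest_border(s[:k])
    return chain


def all_borders(s):
    """Every prefix-suffix length of s, found by brute force, longest first."""
    return [k for k in range(len(s) - 1, 0, -1) if s[:k] == s[-k:]]


print(border_chain("ABABABA"))
print(all_borders("ABABABA"))
```

Both lines print [5, 3, 1], so falling back one step at a time does not skip any candidate.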
I guess it's kind of like dynamic programming where you rely on previously computed values.
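If it helps to run it, below is a rough Python transcription of the pseudocode from the question. It is only a sketch: returning the table as a list and handling words shorter than two characters are my own assumptions, since the pseudocode fills a caller-supplied T and does not cover those cases.

```python
def kmp_table(w):
    """Build the partial match table for w, following the quoted pseudocode."""
    n = len(w)
    t = [0] * n          # assumption: return a new list instead of filling T
    if n > 0:
        t[0] = -1        # fixed first value; t[1] stays 0
    pos, cnd = 2, 0
    while pos < n:
        if w[pos - 1] == w[cnd]:
            # First case: the candidate prefix continues.
            cnd += 1
            t[pos] = cnd
            pos += 1
        elif cnd > 0:
            # Second case: fall back to the next-shorter candidate,
            # reusing a value computed earlier in the table.
            cnd = t[cnd]
        else:
            # Third case: no candidates left (cnd == 0).
            t[pos] = 0
            pos += 1
    return t


print(kmp_table("ABCDABD"))
print(kmp_table("AABAAAB"))
```

The first call prints [-1, 0, 0, 0, 0, 1, 2]. The second prints [-1, 0, 1, 0, 1, 2, 2]; while computing its last entry, the loop hits the second case with cnd = 2, falls back to T[2] = 1, and then matches.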
This explanation might help you.