KMP算法中使用的故障功能如何工作? [英] How does the Failure function used in KMP algorithm work?

查看:116
本文介绍了KMP算法中使用的故障功能如何工作?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我已经尽力阅读了大多数文献,但对于KMP算法中使用的故障函数的构造方式仍然一无所知。我主要指的是 http://community.topcoder。 com / tc?module = Static& d1 = tutorials& d2 = stringSearching 教程,大多数人都认为它很棒。但是,我仍然不了解。如果您能给我一个更简单易懂的解释,我将不胜感激。

I've tried my best reading most of the literature on this, and still haven't understood anything about how the failure function used in KMP algorithm is constructed. I've been referring mostly to http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=stringSearching tutorial which most of the people consider excellent. However, I still have not understood it. I'd be thankful if you could take the pain of giving me a simpler and easy to understand explanation on it.

推荐答案

失败函数实际上告诉我们:如果您匹配一个字符串的X个字符,那么该字符串的最长后缀是什么,使得它也是搜索字符串的前缀。

The failure function actually tells us this: if you matched X characters of a string, what is the longest suffix of such string such that it's also a prefix of a search string.

您在问它的构建方式,方法非常简单。

You are asking how it's built, the approach is quite straightforward.

如果在字符串的末尾,即您正在构建f [x],如果它与位置f [x-1]处的字符匹配,则f [x]就是f [x-1] +1。

If you add a new character at the end of a string, that is you are building f[x], and if it matches with character at position f[x-1], then f[x] is simply f[x-1]+1.

在其他不匹配的情况下,您尝试查找越来越小的后缀并检查它们是否匹配。

In the other cases where it doesn't match, you try to find smaller and smaller suffixes and check if they match.

例如,您有一个单词 accadaccac ,正在为其构建故障函数,而您只添加了字母'c'。假设您正在为最后一个字母,即字母'c'构建故障函数。

For example, you have a word "accadaccac" for which you are building a failure function and you just added the letter 'c'. Let's say you are building a failure function for the last letter, the letter 'c'.


  • 首先检查前一个字母的失败函数,因为它可以匹配后缀 acca 和前缀,所以它的失败函数为4 acca ,现在您添加字母'c',它与字母'd不匹配'后继前缀 acca

  • 所以您回溯到最后一个后缀。您现在正在搜索后缀 acca ,它也是 accadaccac 的前缀而不是 acca。该问题的答案是f [length( acca)-1]或f [3],即f [3] = 1,因为后缀的长度为1(只是字母'a ')也是搜索字符串的前缀。

  • 现在您可以尝试使用'c'与位置1上的字符匹配,瞧,它匹配了,所以现在您知道f [9] = f [f [8] -1] +1 = 2。

  • First you check the failure function of the previous letter, its failure function was 4 because you can match suffix "acca" with the prefix "acca", and now you add the letter 'c', it doesn't match with the letter 'd' succeeding prefix "acca".
  • So you backtrack, to the last good suffix. You are now searching for a suffix of "acca" which is also a prefix of "accadaccac", but is smaller than "acca". The answer to that question is f[length("acca")-1], or f[3], which is f[3] = 1, because suffix of length 1 (just the letter 'a') is also a prefix of a search string.
  • And now you can try if the 'c' matches with the character on the position 1, and voila, it matches, so now you know f[9] = f[f[8]-1]+1 = 2.

我希望这会对您有所帮助。祝好运! :)

I hope this will help you. Good luck! :)

这篇关于KMP算法中使用的故障功能如何工作?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆