Haskell中的Knuth-Morris-Pratt算法 [英] Knuth-Morris-Pratt algorithm in Haskell

查看:211
本文介绍了Haskell中的Knuth-Morris-Pratt算法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在理解Haskell中的Knuth-Morris-Pratt算法的实现方面遇到麻烦。



http://twanvl.nl/blog/haskell/Knuth-Morris-Pratt-in-Haskell



特别是我不了解自动机的构造。我知道它使用绑结方法来构建它,但它并不清楚,我也不知道为什么它应该有正确的复杂性。


$ b $另一件我想知道的是,你是否认为这个实现可以很容易地推广到实现Aho-Corasick算法。



感谢您的回答!

解决方案

这是算法:

  makeTable :: Eq a => [a]  - > KMP a 
makeTable xs = table
其中table = makeTable'xs(const table)
$ b makeTable'[] failure = KMP True failure
makeTable'(x: xs)failure = KMP False test
其中test c = if c == x then success else else c
success = makeTable'xs(next(failure x))

使用这个,我们来看看为shoeshop构造的表格:

  makeTableshoeshop= table0 

table0 = makeTable'shoeshop(const table0)
= KMP False test0

test0 c = if c =='s'then success1 else const table0 c
= if c =='s'then success1 else table0

success1 = makeTable'hoeshop(next(const table0's'))
= makeTable''hoeshop'(next table0)
= makeTable''hoeshop'test0
= KMP False test1

test1 c = if c =='h'then success2 else test0 c

success2 = makeTable'oeshop (next(test0'h'))
= makeTable'oeshop(next table0)
= makeTable'oeshoptest0
= makeTable'oeshoptest0
= KMP False test2

test2 c = if c =='o'then success3 else test0 c

success3 = makeTable'eshop(next(test0'o') )
= makeTable'eshop(next table0)
= makeTable'eshoptest0
= KMP False test3

test3 c = if c ==' e'then success4 else test0 c

success4 = makeTable'shop(next(test0'e'))
= makeTable'shop(next table0)
= makeTable''shop'test0
= KMP False test4

test4 c = if c =='s'then success5 else test0 c

success5 = makeTable' (下一个(test0's'))
= makeTable'hop(next success1)
= makeTable'hoptest1
= KMP False test5

test5 c = if c =='h'then success6 else test1 c

success6 = makeTable'op(next(test1'h'))
= makeTable'op(next success2)
= makeTable' optest2
= KMP False test6

test6 c = if c =='o'then success7 else test2 c

success7 = makeTable'p(下一个(test2'o'))
= makeTable'p(next success3)
= makeTable'ptest3
= KMP False test7

test7 c = if c =='p'then success8 else test3 c

success8 = makeTable'(next(test3'p'))
= makeTable'(next(test0 'p'))
= makeTable'(next table0)
= makeTable'test0
= KMP True test0
success5 如何使用消耗的's'来回溯模式的初始's'。



现在仔细检查 isSubstringOf2shoeshop$ cycleshoe时发生的情况。



看到 test7 无法匹配'p'时,它会回落到 test3 尝试匹配'e',以便我们循环 success4 success5 success6 success7 ad infinitum。


I have a trouble with understanding this implementation of the Knuth-Morris-Pratt algorithm in Haskell.

http://twanvl.nl/blog/haskell/Knuth-Morris-Pratt-in-Haskell

In particular I don't understand the construction of the automaton. I know that it uses the "Tying the Knot" method to construct it, but it isn't clear to me and I also don't know why it should have the right complexity.

Another thing I would like to know is whether you think that this implementation could be easily generalized to implement the Aho-Corasick algorithm.

Thanks for your answers!

解决方案

So here's the algorithm:

makeTable :: Eq a => [a] -> KMP a
makeTable xs = table
   where table = makeTable' xs (const table)

makeTable' []     failure = KMP True failure
makeTable' (x:xs) failure = KMP False test
   where  test  c = if c == x then success else failure c
          success = makeTable' xs (next (failure x))

Using that, let's look at the table constructed for "shoeshop":

makeTable "shoeshop" = table0

table0 = makeTable' "shoeshop" (const table0)
       = KMP False test0

test0 c = if c == 's' then success1 else const table0 c
        = if c == 's' then success1 else table0

success1 = makeTable' "hoeshop" (next (const table0 's'))
         = makeTable' "hoeshop" (next table0)
         = makeTable' "hoeshop" test0
         = KMP False test1

test1 c = if c == 'h' then success2 else test0 c

success2 = makeTable' "oeshop" (next (test0 'h'))
         = makeTable' "oeshop" (next table0)
         = makeTable' "oeshop" test0
         = makeTable' "oeshop" test0
         = KMP False test2

test2 c = if c == 'o' then success3 else test0 c

success3 = makeTable' "eshop" (next (test0 'o'))
         = makeTable' "eshop" (next table0)
         = makeTable' "eshop" test0
         = KMP False test3

test3 c = if c == 'e' then success4 else test0 c

success4 = makeTable' "shop" (next (test0 'e'))
         = makeTable' "shop" (next table0)
         = makeTable' "shop" test0
         = KMP False test4

test4 c = if c == 's' then success5 else test0 c

success5 = makeTable' "hop" (next (test0 's'))
         = makeTable' "hop" (next success1)
         = makeTable' "hop" test1
         = KMP False test5

test5 c = if c == 'h' then success6 else test1 c

success6 = makeTable' "op" (next (test1 'h'))
         = makeTable' "op" (next success2)
         = makeTable' "op" test2
         = KMP False test6

test6 c = if c == 'o' then success7 else test2 c

success7 = makeTable' "p" (next (test2 'o'))
         = makeTable' "p" (next success3)
         = makeTable' "p" test3
         = KMP False test7

test7 c = if c == 'p' then success8 else test3 c

success8 = makeTable' "" (next (test3 'p'))
         = makeTable' "" (next (test0 'p'))
         = makeTable' "" (next table0)
         = makeTable' "" test0
         = KMP True test0

Note how success5 uses the consumed 's' to retrace the initial 's' of the pattern.

Now walk through what happens when you do isSubstringOf2 "shoeshop" $ cycle "shoe".

See that when test7 fails to match 'p', it falls back to test3 to try to match 'e', so that we cycle through success4, success5, success6 and success7 ad infinitum.

这篇关于Haskell中的Knuth-Morris-Pratt算法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆