What algorithm can you use to find duplicate phrases in a string?
Question
Given an arbitrary string, what is an efficient method of finding duplicate phrases? Assume a phrase must be longer than some minimum length to be included.
Ideally, you would end up with the number of occurrences of each phrase.
Recommended answer
As the earlier answers mention, a suffix tree is the best tool for the job. My favorite site for suffix trees is http://www.allisons.org/ll/AlgDS/Tree/Suffix/. It lists the nifty uses of suffix trees on one page and has an embedded JS app for testing strings and working through examples.