最好的字符串搜索算法 [英] Best string search algorithm around

查看:157
本文介绍了最好的字符串搜索算法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个代码,在其中我比较一个大数据,例如一个网页的源文件中的某些单词。



可以有两种情况:


  1. 如果我有大量的词与源进行比较,在这种情况下,对于正常的字符串搜索算法,它将需要一个字,与数据进行比较,取下一个,并与数据等等,直到所有完成。


  2. 我在文件中只有几个字,正常的字符串搜索会正常,但仍然需要


什么算法最好?我知道Boyer-Moore和Rabin-Karp搜索算法。
虽然Boyer-Moore的搜索似乎很快,但我也喜欢其他算法的名称和比较。

解决方案

两种情况下,我想你可能想要构建一个patricia trie(也称为基数树)。最重要的是,查找时间将是O(k),其中k是字符串中字符串的最大长度。


I have a code where in i compare a large data, say a source of a web page against some words in a file. What is the best algorithm to be used?

There can be 2 scenarios:

  1. If I have a large amount of words to compare against the source, In which case, for a normal string search algorithm, it would have to take a word, compare against the data, take the next and compare against the data and so on until all is complete.

  2. I have only a couple of words in the file and the normal string search would be ok, but still need to reduce the time as much as possible.

What algorithm is best? I know about Boyer-Moore and also Rabin-Karp search algorithms. Although Boyer-Moore search seems fast, I would also like names of other algorithms and their comparisons.

解决方案

In both cases, I think you probably want to construct a patricia trie (also called radix tree). Most importantly, lookup time would be O(k), where k is the max length of a string in the trie.

这篇关于最好的字符串搜索算法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆