Algorithms for fast approximate string matching

Problem description

Given a source string s and n equal-length strings, I need a fast algorithm that returns those strings that differ from s in at most k positions, comparing characters at each corresponding position.
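As a point of reference, the problem statement above can be sketched as a plain linear scan. This is only a baseline under the stated assumption that all strings have the same length as s; the names hamming_within and linear_scan are made up for illustration and do not come from the question.

```python
def hamming_within(a, b, k):
    """Return True if a and b differ in at most k positions.

    Assumes len(a) == len(b). Stops early once the mismatch
    count exceeds k, so hopeless candidates are rejected quickly.
    """
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > k:
                return False
    return True


def linear_scan(source, candidates, k):
    """Return the candidates within Hamming distance k of source, in input order."""
    return [c for c in candidates if hamming_within(source, c, k)]


words = ["karolin", "kathrin", "kerstin", "carolin"]
print(linear_scan("karolin", words, 1))  # ['karolin', 'carolin']
```

Each query costs O(n * m) character comparisons in the worst case for n strings of length m, which is the cost any preprocessing scheme would try to beat.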

What is a fast algorithm to do so?

PS: I should note that this is an academic question. I want to find the most efficient algorithm, if possible.

Also, I left out one very important piece of information. The n equal-length strings form a dictionary against which many source strings s will be queried. It seems there should be some sort of preprocessing step to make this more efficient.

Recommended answer

Sedgewick, in his book "Algorithms", writes that a ternary search tree allows one "to locate all words within a given Hamming distance of a query word". See also the article in Dr. Dobb's.
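The idea of searching a ternary search tree with a mismatch budget might look roughly like the sketch below. It is not Sedgewick's code or the code from the Dr. Dobb's article; the names Node, tst_insert and tst_near are hypothetical, and it assumes non-empty dictionary words of the same length as the query.

```python
class Node:
    """One ternary search tree node: a character plus three children."""
    __slots__ = ("ch", "lo", "eq", "hi", "is_word")

    def __init__(self, ch):
        self.ch = ch
        self.lo = self.eq = self.hi = None
        self.is_word = False


def tst_insert(root, word):
    """Insert a non-empty word and return the (possibly new) root."""
    if not word:
        raise ValueError("empty words are not supported in this sketch")
    if root is None:
        root = Node(word[0])
    node, i = root, 0
    while True:
        ch = word[i]
        if ch < node.ch:
            if node.lo is None:
                node.lo = Node(ch)
            node = node.lo
        elif ch > node.ch:
            if node.hi is None:
                node.hi = Node(ch)
            node = node.hi
        else:
            i += 1
            if i == len(word):
                node.is_word = True
                return root
            if node.eq is None:
                node.eq = Node(word[i])
            node = node.eq


def tst_near(root, query, k):
    """Return stored words of len(query) within Hamming distance k of query."""
    results = []

    def walk(node, i, budget, prefix):
        if node is None:
            return
        # lo/hi hold alternative characters for position i. With no budget
        # left, only the side that could contain query[i] is worth visiting.
        if budget > 0 or query[i] < node.ch:
            walk(node.lo, i, budget, prefix)
        if budget > 0 or query[i] > node.ch:
            walk(node.hi, i, budget, prefix)
        remaining = budget - (0 if node.ch == query[i] else 1)
        if remaining < 0:
            return
        prefix.append(node.ch)
        if i == len(query) - 1:
            if node.is_word:
                results.append("".join(prefix))
        else:
            walk(node.eq, i + 1, remaining, prefix)
        prefix.pop()

    if query:
        walk(root, 0, k, [])
    return results


root = None
for w in ["karolin", "kathrin", "kerstin", "carolin"]:
    root = tst_insert(root, w)
print(sorted(tst_near(root, "karolin", 1)))  # ['carolin', 'karolin']
```

The tree is built once from the dictionary, which is the preprocessing step the question asks about; each query then abandons a branch as soon as the mismatch budget is exhausted, so strings sharing a bad prefix are rejected together. The recursion is kept for readability, and an iterative walk may be preferable for large alphabets or long strings.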
