How to find all strings at a given edit distance from a given string


Question

We have all seen that if we type a query into Google and make a typo, Google suggests a saner version of the query (which is correct more often than not). How do they do it? One possible way I can think of is to find all other strings at an edit distance of 1 from the given string. If any of them has a higher value for a 'searched' attribute than the given string, that string is suggested. (The attribute might come from a back-end DB, where each indexed query term has a weight based on how frequently that term crops up in queries.) If none is found, strings at an edit distance of 2 are searched, and so on, until, say at 5, the search engine decides that the given string may be the one the user is looking for after all and returns the corresponding search results.

Is it possible at all to find the strings at a given edit distance from a given string? How efficient would that be for this process? Is there a good algorithm for it?

Recommended answer

Peter Norvig has an interesting article, "How to Write a Spelling Corrector", that discusses how a "Did you mean" feature might work.
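
To make the idea concrete, here is a minimal Python sketch in the spirit of that article, not a description of what Google actually does. It generates every string one edit away from a word (deletion, adjacent transposition, replacement, insertion), widens the search one edit at a time, and picks the most frequent candidate. The lowercase ASCII alphabet, the function names, and the `counts` frequency table are assumptions made for illustration; `counts` stands in for the 'searched' weight from the question.

```python
import string

# Assumption: words use lowercase ASCII letters only.
ALPHABET = string.ascii_lowercase


def edits1(word, alphabet=ALPHABET):
    """Return all strings reachable from `word` with one edit.

    The result can contain `word` itself, because replacing a
    letter with the same letter is generated as well.
    """
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {l + r[1:] for l, r in splits if r}
    transposes = {l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1}
    replaces = {l + c + r[1:] for l, r in splits if r for c in alphabet}
    inserts = {l + c + r for l, r in splits for c in alphabet}
    return deletes | transposes | replaces | inserts


def within_distance(word, k, alphabet=ALPHABET):
    """Return all strings at most `k` edits away from `word`."""
    found = {word}
    frontier = {word}
    for _ in range(k):
        # Expand only the strings discovered in the previous round.
        frontier = {e for w in frontier for e in edits1(w, alphabet)} - found
        found |= frontier
    return found


def suggest(word, counts, max_distance=2):
    """Suggest the most frequent string that beats `word`'s own count.

    `counts` is a hypothetical mapping from query term to frequency.
    """
    baseline = counts.get(word, 0)
    for k in range(1, max_distance + 1):
        better = [w for w in within_distance(word, k)
                  if counts.get(w, 0) > baseline]
        if better:
            return max(better, key=counts.get)
    return word  # nothing more popular nearby; keep the original
```

For a word of length n over a 26-letter alphabet, one round of edits produces 54n + 25 strings before duplicates are removed, and each extra round multiplies that by roughly the same factor. Going as far as distance 5 by enumeration, as the question proposes, would therefore be very expensive; in this sketch the practical approach is to stop at 1 or 2 edits and keep only candidates that appear in the known-terms table.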
