Finding how similar two strings are

Views: 119 Published: 2015/11/30 13:40:28 Tags: algorithm string-matching

Problem description

I'm looking for an algorithm that takes 2 strings and will give me back a "factor of similarity".

Basically, I will have an input that may be misspelled, have letters transposed, etc., and I have to find the closest match(es) in a list of possible values that I have.

This is not for searching in a database. I'll have an in-memory list of 500 or so strings to match against, all under 30 chars, so it can be relatively slow.

I know this exists, I've seen it before, but I can't remember its name.

Edit: Thanks for pointing out Levenshtein and Hamming. Now, which one should I implement? They basically measure different things, both of which can be used for what I want, but I'm not sure which one is more appropriate.
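
To make "they measure different things" concrete, here is a minimal Python sketch of both distances. The function names `hamming` and `levenshtein` are chosen for illustration and are not from any particular library. Hamming distance counts mismatched positions and is only defined for strings of equal length. Levenshtein distance counts the single-character insertions, deletions, and substitutions needed to turn one string into the other.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or keep on a match
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(hamming("Jordan", "Jodran"))      # two positions differ
    print(levenshtein("Jordan", "Jordn"))   # one missing letter
```

Since misspelled input will often have a letter missing or added, the equal-length requirement is the main practical limit of Hamming distance here. For 500 strings under 30 characters, the quadratic cost of Levenshtein per pair should be affordable.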

I've read up on the algorithms, and Hamming seems obviously faster. Since neither will detect two characters being transposed (i.e. Jordan and Jodran), which I believe will be a common mistake, which will be more accurate for what I want? Can someone tell me a bit about the trade-offs?
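
On the transposition concern: plain Hamming and Levenshtein both score "Jordan" versus "Jodran" as 2, because they treat the swap as two separate changes. One variant that counts an adjacent swap as a single edit is the optimal string alignment distance, a restricted form of Damerau-Levenshtein. The sketch below is one possible way to use it for the closest-match lookup. The names `osa_distance`, `similarity`, and `closest_matches`, the normalization by the longer length, and the candidate list are all assumptions made for illustration.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance: Levenshtein edits plus
    swaps of two adjacent characters, each counted as one edit."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition: "rd" in a lines up with "dr" in b.
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical 'factor of similarity' in [0, 1]; 1.0 means identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - osa_distance(a, b) / longest


def closest_matches(query: str, candidates: list[str]) -> list[str]:
    """All candidates that share the smallest distance to the query."""
    scored = [(osa_distance(query, c), c) for c in candidates]
    best = min(score for score, _ in scored)
    return [c for score, c in scored if score == best]


if __name__ == "__main__":
    names = ["Jordan", "Joanna", "Gordon"]  # made-up sample list
    print(closest_matches("Jodran", names))
    print(round(similarity("Jodran", "Jordan"), 3))
```

Returning every candidate tied for the smallest distance matches the "closest match(es)" wording in the question. A cutoff on `similarity` could be added if inputs that resemble nothing in the list should be rejected.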
