String similarity score/hash

Views: 177 Published: 2015/11/30 14:15:36 Tags: algorithm hash similarity

Problem description

Is there a method to calculate something like a general "similarity score" for a string? That is, I am not comparing two strings directly; instead I get some number (a hash) for each string that can later tell me whether two strings are similar or not. Two similar strings should have similar (close) hashes.

Let's consider these strings and scores as an example:

Hello world                1000
Hello world!               1010
Hello earth                1125
Foo bar                    3250
FooBarbar                  3750
Foo Bar!                   3300
Foo world!                 2350

You can see that "Hello world!" and "Hello world" are similar, and their scores are close to each other.

This way, finding the strings most similar to a given string would be done by subtracting the given string's score from each of the other scores and then sorting by the absolute values of the differences.
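
The lookup step described above can be sketched in Python as follows. This is only an illustration: the scores are copied from the example table and do not come from any real hashing function, and the helper name `most_similar` and its parameter `k` are hypothetical.

```python
# Scores copied from the example table in the question. They are illustrative
# values, not the output of an actual similarity-preserving hash.
scores = {
    "Hello world": 1000,
    "Hello world!": 1010,
    "Hello earth": 1125,
    "Foo bar": 3250,
    "FooBarbar": 3750,
    "Foo Bar!": 3300,
    "Foo world!": 2350,
}


def most_similar(query, scores, k=3):
    """Return the k strings whose score is closest to the query's score.

    Hypothetical helper: subtract the query's score from every other score
    and sort by the absolute value of the difference.
    """
    target = scores[query]
    others = [s for s in scores if s != query]
    return sorted(others, key=lambda s: abs(scores[s] - target))[:k]


print(most_similar("Hello world", scores))
# → ['Hello world!', 'Hello earth', 'Foo world!']
```

With the example scores, the differences from "Hello world" are 10, 125, and 1350 for the three closest strings, which gives the ordering shown. How to compute scores that behave this way for arbitrary strings is the open part of the question.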
