How do I convert between a measure of similarity and a measure of difference (distance)?


Question

Is there a general way to convert between a measure of similarity and a measure of distance?

Consider a similarity measure such as the number of 2-grams that two strings have in common.

2-grams('beta', 'delta') = 1
2-grams('apple', 'dappled') = 4
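
For reference, the two counts above can be reproduced with a short sketch. It assumes that a 2-gram is a pair of adjacent characters and that each distinct 2-gram is counted once; the function names bigrams and common_bigrams are hypothetical and do not come from the question.

```python
def bigrams(text):
    """Return the set of distinct 2-grams (adjacent character pairs) in text."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


def common_bigrams(a, b):
    """Count the distinct 2-grams that occur in both strings (assumed definition)."""
    return len(bigrams(a) & bigrams(b))


print(common_bigrams("beta", "delta"))     # 1  (shared: "ta")
print(common_bigrams("apple", "dappled"))  # 4  (shared: "ap", "pp", "pl", "le")
```

Note that this count is unbounded and grows with string length, which is why some form of normalization is needed before it can be turned into a distance.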

What if I need to feed this to an optimization algorithm that expects a measure of difference, such as Levenshtein distance?

This is just an example. I'm looking for a general solution, if one exists. For instance, how would I go from Levenshtein distance to a measure of similarity?

I appreciate any guidance you may offer.

Recommended answer

Let d denote distance and s denote similarity. To convert a distance measure to a similarity measure, first normalize d to [0, 1] using d_norm = d / max(d), where max(d) is the maximum distance. The similarity measure is then given by:

s = 1 - d_norm

where s lies in the range [0, 1]: 1 denotes the highest similarity (the items being compared are identical) and 0 denotes the lowest similarity (the largest distance).
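
The normalization described in this answer could be sketched as follows. This is a minimal illustration, not a definitive implementation: the function names are hypothetical, max(d) is taken over the list of distances passed in, and the zero-maximum guard is an added assumption. The reverse direction (similarity to distance) is not spelled out in the answer; the second function simply assumes the mirror-image formula d = 1 - s / max(s).

```python
def distances_to_similarities(distances):
    """Map distances to similarities in [0, 1] using s = 1 - d / max(d)."""
    d_max = max(distances)
    if d_max == 0:
        # Assumption: if every distance is 0, all items are identical.
        return [1.0 for _ in distances]
    return [1.0 - d / d_max for d in distances]


def similarities_to_distances(similarities):
    """Assumed mirror image of the answer's formula: d = 1 - s / max(s)."""
    s_max = max(similarities)
    if s_max == 0:
        # Assumption: if nothing is shared, every pair is maximally distant.
        return [1.0 for _ in similarities]
    return [1.0 - s / s_max for s in similarities]


# Example: Levenshtein-style distances converted to similarities.
print(distances_to_similarities([0, 1, 2, 4]))  # [1.0, 0.75, 0.5, 0.0]

# Example: the shared 2-gram counts from the question converted to distances.
print(similarities_to_distances([1, 4]))        # [0.75, 0.0]
```

One caveat with the reverse sketch: the pair with the highest raw similarity gets distance 0 even if the two items are not identical, so dividing by a known upper bound on the similarity may be a better choice when one exists.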
