Can someone give an example of cosine similarity, in a very simple, graphical way?


Question

Cosine Similarity article on Wikipedia

Can you show the vectors here (in a list or something) and then do the math, and let us see how it works?

I'm a beginner.

Recommended answer

Here are two very short texts to compare:


  1. Julie loves me more than Linda loves me
  2. Jane likes me more than Julie loves me

We want to know how similar these texts are, purely in terms of word counts (and ignoring word order). We begin by making a list of the words from both texts:

me Julie loves Linda than more likes Jane

Now we count the number of times each of these words appears in each text:

 word   text 1   text 2
   me      2        2
 Jane      0        1
Julie      1        1
Linda      1        0
likes      0        1
loves      2        1
 more      1        1
 than      1        1
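The counting step can be sketched in a few lines of Python. This is only an illustration, not part of the original answer: the helper name `count_vector` is made up for this example, the two sentences are written out in English, and the vocabulary order simply follows the rows of the table.

```python
from collections import Counter

text_a = "Julie loves me more than Linda loves me"
text_b = "Jane likes me more than Julie loves me"

# Row order of the count table. Any fixed order works, as long as
# both vectors use the same one.
vocabulary = ["me", "Jane", "Julie", "Linda", "likes", "loves", "more", "than"]

def count_vector(text, vocabulary):
    """Hypothetical helper: count how often each vocabulary word occurs in text."""
    counts = Counter(text.split())  # a Counter returns 0 for missing words
    return [counts[word] for word in vocabulary]

a = count_vector(text_a, vocabulary)
b = count_vector(text_b, vocabulary)
print(a)  # [2, 0, 1, 1, 0, 2, 1, 1]
print(b)  # [2, 1, 1, 0, 1, 1, 1, 1]
```

Splitting on whitespace is enough here because the texts have no punctuation; real text would need proper tokenisation.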

We are not interested in the words themselves, though. We are interested only in those two vertical vectors of counts. For instance, there are two instances of 'me' in each text. We are going to decide how close these two texts are to each other by calculating one function of those two vectors, namely the cosine of the angle between them.

The two vectors are, again:

a: [2, 0, 1, 1, 0, 2, 1, 1]

b: [2, 1, 1, 0, 1, 1, 1, 1]

The cosine of the angle between them is about 0.822.
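Since the question asks to see the math, here is one way to reproduce that number. It uses the standard definition of the cosine of the angle between two vectors: their dot product divided by the product of their lengths. The function name `cosine_similarity` is chosen for this sketch only, and the arithmetic is written out in the comments.

```python
import math

a = [2, 0, 1, 1, 0, 2, 1, 1]
b = [2, 1, 1, 0, 1, 1, 1, 1]

def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their lengths."""
    # Dot product: 2*2 + 0*1 + 1*1 + 1*0 + 0*1 + 2*1 + 1*1 + 1*1 = 9
    dot = sum(x * y for x, y in zip(u, v))
    # Length of a: sqrt(4 + 0 + 1 + 1 + 0 + 4 + 1 + 1) = sqrt(12)
    norm_u = math.sqrt(sum(x * x for x in u))
    # Length of b: sqrt(4 + 1 + 1 + 0 + 1 + 1 + 1 + 1) = sqrt(10)
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

similarity = cosine_similarity(a, b)  # 9 / sqrt(120)
print(round(similarity, 3))  # 0.822
```

Two texts with identical counts would give a cosine of 1, and two texts sharing no words at all would give 0.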

These vectors are 8-dimensional. A virtue of using cosine similarity is clearly that it converts a question that is beyond human ability to visualise into one that can be visualised. In this case you can think of the result as an angle of about 35 degrees, which is some 'distance' from zero, or perfect agreement.
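To check the angle, take the inverse cosine of the similarity and convert from radians to degrees. A minimal sketch, reusing the dot product (9) and squared lengths (12 and 10) that follow from the two count vectors:

```python
import math

# Cosine of the angle between a and b: dot product 9, lengths sqrt(12) and sqrt(10).
cosine = 9 / math.sqrt(12 * 10)

# acos returns radians; convert to degrees for easier reading.
angle_degrees = math.degrees(math.acos(cosine))
print(round(angle_degrees, 1))  # a little under 35 degrees
```

An angle of 0 degrees would mean the two count vectors point in exactly the same direction, and 90 degrees would mean the texts have no words in common.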
