Can someone give an example of cosine similarity, in a very simple, graphical way?


Question

The Cosine similarity article on Wikipedia

Can you show the vectors here (in a list or something) and then do the math, and let us see how it works?

Recommended answer

Here are two very short texts to compare:

  1. Julie loves me more than Linda loves me

  2. Jane likes me more than Julie loves me

We want to know how similar these texts are, purely in terms of word counts (and ignoring word order). We begin by making a list of the words from both texts:

me Julie loves Linda than more likes Jane

Now we count the number of times each of these words appears in each text:

 word   text 1   text 2
   me     2        2
 Jane     0        1
Julie     1        1
Linda     1        0
likes     0        1
loves     2        1
 more     1        1
 than     1        1
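
The counting step can be sketched in a few lines of Python. This is only an illustration, not part of the original answer: the helper name `count_vector` is made up here, the two sentences are the English wording implied by the counts in the table, and the vocabulary is listed in the same row order as the table.

```python
from collections import Counter

# The two short texts, in the wording implied by the counts in the table.
text1 = "Julie loves me more than Linda loves me"
text2 = "Jane likes me more than Julie loves me"

# Vocabulary in the same row order as the table.
vocabulary = ["me", "Jane", "Julie", "Linda", "likes", "loves", "more", "than"]


def count_vector(text, vocabulary):
    """Return how often each vocabulary word occurs in the text (hypothetical helper)."""
    counts = Counter(text.split())
    # Counter returns 0 for words that do not occur in the text.
    return [counts[word] for word in vocabulary]


a = count_vector(text1, vocabulary)
b = count_vector(text2, vocabulary)
print(a)
print(b)
```

Each printed list is one column of counts from the table, read top to bottom.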

We are not interested in the words themselves though. We are interested only in those two vertical vectors of counts. For instance, there are two instances of 'me' in each text. We are going to decide how close these two texts are to each other by calculating one function of those two vectors, namely the cosine of the angle between them.

The two vectors are, again:

a: [2, 0, 1, 1, 0, 2, 1, 1]

b: [2, 1, 1, 0, 1, 1, 1, 1]

The cosine of the angle between them is about 0.822.
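
To see where that number comes from, here is a minimal sketch using only the standard library. The function name `cosine_similarity` is chosen for illustration. The cosine is the dot product of the two vectors divided by the product of their lengths: the dot product is 9, the squared lengths are 12 and 10, so the value is 9 / sqrt(120).

```python
import math

a = [2, 0, 1, 1, 0, 2, 1, 1]
b = [2, 1, 1, 0, 1, 1, 1, 1]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))      # 9 for a and b
    norm_u = math.sqrt(sum(x * x for x in u))   # sqrt(12) for a
    norm_v = math.sqrt(sum(y * y for y in v))   # sqrt(10) for b
    return dot / (norm_u * norm_v)


print(round(cosine_similarity(a, b), 3))  # 0.822
```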

These vectors are 8-dimensional. A virtue of using cosine similarity is that it converts a question that is beyond human ability to visualise into one that can be visualised. In this case you can think of the result as an angle of about 35 degrees, which is some 'distance' from zero, or perfect agreement.
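
The angle itself can be recovered from the cosine with an inverse cosine. The sketch below is an illustration under the same assumptions as the vectors listed earlier. The name `angle_degrees` is hypothetical, and the clamp to [-1, 1] is a precaution added here against floating-point rounding, not something the answer itself mentions.

```python
import math

a = [2, 0, 1, 1, 0, 2, 1, 1]
b = [2, 1, 1, 0, 1, 1, 1, 1]


def angle_degrees(u, v):
    """Angle between two vectors, in degrees (hypothetical helper)."""
    dot = sum(x * y for x, y in zip(u, v))
    norms = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    # Clamp so rounding error cannot push the value outside acos's domain.
    cos_theta = max(-1.0, min(1.0, dot / norms))
    return math.degrees(math.acos(cos_theta))


print(angle_degrees(a, b))  # a little under 35 degrees
print(angle_degrees(a, a))  # identical vectors: effectively 0 degrees
```

Comparing a vector with itself gives an angle of essentially zero, which is the "perfect agreement" end of the scale.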
