Find cosine similarity between two arrays


Question

I'm wondering if there is a built-in function in R that can find the cosine similarity (or cosine distance) between two arrays?

Currently, I have implemented my own function, but I can't help but think that R should already come with one.

Recommended answer

These sorts of questions come up all the time (for me and, as evidenced by the r-tagged SO question list, for others as well):

Is there a function in R core or in any R package that does x?

Where can I find it among the 2000+ R packages on CRAN?

Short answer: give the sos package a try when these sorts of questions come up.

One of the earlier answers gave cosine along with a link to its help page. This is probably exactly what the OP wants. When you look at the linked-to page, you see that this function is in the lsa package.
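For comparison, the computation itself is short enough to sketch in base R. The name cosine_similarity below is made up for illustration and is not part of lsa or base R; the sketch assumes two numeric vectors of equal length, neither of which is all zeros.

```r
# Hypothetical helper (not from lsa or base R): cosine similarity of two
# numeric vectors, i.e. their dot product divided by the product of their
# Euclidean norms.
cosine_similarity <- function(x, y) {
  stopifnot(is.numeric(x), is.numeric(y), length(x) == length(y))
  norm_x <- sqrt(sum(x^2))
  norm_y <- sqrt(sum(y^2))
  if (norm_x == 0 || norm_y == 0) {
    stop("cosine similarity is undefined for a zero vector")
  }
  sum(x * y) / (norm_x * norm_y)
}

a <- c(1, 2, 3)
b <- c(2, 4, 6)
cosine_similarity(a, b)        # parallel vectors: 1 up to floating-point error
1 - cosine_similarity(a, b)    # the corresponding cosine distance
```

If lsa is installed, lsa::cosine(a, b) should agree numerically with this sketch for two plain vectors.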

But how would you find this function if you didn't already know which package to look for it in?

You can always try the standard R help functions (">" below just means the R command line):

> ?<some_name>

> ??<some_name>

> apropos("<some_name>")
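As a small illustration of the lookups above, apropos() takes a string or regular expression and returns the names of matching objects on the search path. The pattern "^cos" is only an example chosen for this sketch.

```r
# apropos() searches the names of objects in attached packages only,
# so it cannot see functions in packages that are installed but not loaded.
matches <- apropos("^cos")
print(matches)

# help.search("cosine") is the function form of ??cosine; it searches the
# help pages of all installed packages, attached or not.
```

Neither lookup reaches packages that are not installed locally, which is the gap sos is meant to fill.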

If these fail, then install and load the sos package, then use

findFn

findFn is also aliased to "???", though I don't often use that because I don't think you can pass in arguments other than the function name.

For the question here, try this:

> library(sos)

> findFn("cosine", maxPages=2, sortby="MaxScore")

The additional arguments passed in (maxPages=2 and sortby="MaxScore") limit the number of results returned and specify how the results are ranked, respectively. In other words: "find a function named 'cosine' or that has the term 'cosine' in its description, return only two pages of results, and order them by descending relevance score."

The findFn call above returns a data frame with nine columns and one result per row, rendered as HTML.
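The result can also be inspected from the console rather than in the browser. The following is a guarded sketch, assuming sos is installed and the search site it queries is reachable; the column names "Package", "Function", and "Score" are assumptions about the returned data frame, so the code keeps only those that are actually present.

```r
# Assumes the sos package is installed (install.packages("sos")) and that the
# search site it queries is reachable; otherwise `hits` stays NULL.
hits <- NULL
if (requireNamespace("sos", quietly = TRUE)) {
  hits <- tryCatch(
    sos::findFn("cosine", maxPages = 2, sortby = "MaxScore"),
    error = function(e) NULL
  )
}

if (!is.null(hits)) {
  # Column names here are assumed; intersect() keeps only those present.
  cols <- intersect(c("Package", "Function", "Score"), names(hits))
  print(head(as.data.frame(hits)[, cols, drop = FALSE]))
}
```

Wrapping the call in tryCatch() keeps a script running when the search is unavailable, at the cost of silently skipping the lookup.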

Scanning the last column, Description and Link, at item (row) 21 you find:

Cosine Measure (Matrices)

This text is also a link; clicking on it takes you to the help page for that function in the package that contains it. In other words, using findFn you can pretty quickly find the function you want even though you have no idea which package it's in.
