Find cosine similarity between two arrays

Question

Is there a built-in function in R that can find the cosine similarity (or cosine distance) between two arrays?

Currently, I have implemented my own function, but I can't help thinking that R should already come with one.

Recommended answer

These sorts of questions come up all the time (for me, and, as the r-tagged SO question list shows, for others as well):

Is there a function in R core or in any R package that does x? And if so,

where can I find it among the 2000+ R packages on CRAN?

Short answer: give the sos package a try when these sorts of questions come up.

One of the earlier answers gave cosine along with a link to its help page. This is probably exactly what the OP wants. When you look at the linked-to page, you see that this function is in the lsa package.
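For reference, the quantity itself is short to write in base R, which is roughly what the OP describes having done by hand. The following is a minimal sketch, assuming two numeric vectors of equal length; the name cosine_sim is hypothetical and is not part of base R or of lsa.

```r
# Hypothetical helper for illustration; not part of base R or the lsa package.
# Cosine similarity: dot product divided by the product of the Euclidean norms.
cosine_sim <- function(x, y) {
  stopifnot(is.numeric(x), is.numeric(y), length(x) == length(y))
  denom <- sqrt(sum(x^2)) * sqrt(sum(y^2))
  if (denom == 0) {
    return(NA_real_)  # similarity is undefined when either vector is all zeros
  }
  sum(x * y) / denom
}

a <- c(1, 2, 3)
b <- c(2, 4, 6)       # same direction as a, so the similarity is close to 1

sim  <- cosine_sim(a, b)
dist <- 1 - sim       # one common definition of cosine distance
print(c(similarity = sim, distance = dist))
```

If the lsa package is installed, its cosine function should give the same value for two vectors, and it also accepts a matrix.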

But how would you find this function if you didn't already know which package to look in?

You can always try the standard R help functions (">" below just means the R command prompt):

> ?<some_name>

> ??<some_name>

> apropos("<some_name>")
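As a small illustration of why these can fail for the question here: apropos only searches objects on the current search path, so it cannot find a function in a package that is not loaded. A minimal sketch, using "cos" as an arbitrary search string:

```r
# apropos() matches object names on the current search path only.
found <- apropos("cos")
print(found)                  # includes base trigonometric functions such as cos

# Restrict the match to functions.
found_fns <- apropos("cos", mode = "function")
print(found_fns)

# Nothing from lsa can show up here unless that package has been loaded,
# which is the gap a CRAN-wide search is meant to fill.
lsa_loaded <- "package:lsa" %in% search()
print(lsa_loaded)
```

Similarly, ? and ?? only look at packages installed on your machine, so none of the three helps with a package you have never installed.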

If these fail, install and load the sos package, then call

> findFn("<some_name>")

findFn is also aliased to "???", though I don't often use that because I don't think you can pass in arguments other than the function name.

For the question here, try this:

> library(sos)

> findFn("cosine", maxPages=2, sortby="MaxScore")

The additional arguments passed in (maxPages=2 and sortby="MaxScore") limit the number of results returned and specify how the results are ranked, respectively. That is: "find a function named 'cosine', or one that has the term 'cosine' in its description, return only two pages of results, and order them by descending relevance score."

The findFn call above returns a data frame with nine columns and the results as rows, rendered as HTML.

Scanning the last column, Description and Link, at item (row) 21 you find:

Cosine Measure (Matrices)

This text is also a link; clicking on it takes you to the help page for that function in the package that contains it. In other words:

Using findFn, you can pretty quickly find the function you want even if you have no idea which package it's in.
