mlpack nearest neighbor with cosine distance?


Question

I'd like to use the NeighborSearch class in mlpack to perform KNN classification on some vectors representing documents.

I'd like to use Cosine Distance, but I'm having trouble. I think the way to do this is to use the inner-product metric "IPMetric" and specify the CosineDistance kernel... This is what I have:

NeighborSearch<NearestNeighborSort, IPMetric<CosineDistance>> nn(X_train);

But I get the following compile errors:

/usr/include/mlpack/core/tree/hrectbound_impl.hpp:211:15: error: ‘Power’ is not a member of ‘mlpack::metric::IPMetric<mlpack::kernel::CosineDistance>’
 sum += pow((lower + fabs(lower)) + (higher + fabs(higher)),
           ^
/usr/include/mlpack/core/tree/hrectbound_impl.hpp:220:3: error: ‘TakeRoot’ is not a member of ‘mlpack::metric::IPMetric<mlpack::kernel::CosineDistance>’
if (MetricType::TakeRoot)
^

I suspect that the problem may be that the default tree type, KDTree, does not support this distance metric? If that's the issue, is there a tree type that does work for CosineDistance?

Finally, is it possible to use a brute-force search? I can't seem to find a way to use no tree at all...

Thanks!

Recommended answer

Unfortunately, as you suspected, arbitrary metric types don't work with the KDTree. The kd-tree requires a distance that can be decomposed into per-dimension contributions, and that is not possible with IPMetric. Instead, why not try the cover tree? It may take somewhat longer to build, but it should give comparable performance:

NeighborSearch<NearestNeighborSort, IPMetric<CosineDistance>, arma::mat,
    tree::StandardCoverTree> nn(X_train);
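One thing worth keeping in mind when reading the results: as far as I understand the mlpack documentation, the CosineDistance kernel evaluates the cosine similarity, and IPMetric turns a kernel K into the distance sqrt(K(a, a) + K(b, b) - 2 K(a, b)). For cosine similarity that works out to sqrt(2 - 2 cos), which ranks neighbors in the same order as 1 - cos but is not numerically equal to it. The sketch below reproduces that arithmetic in plain C++ with no mlpack dependency; the function names CosineSimilarity and InducedDistance are made up for illustration and are not part of mlpack.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Cosine similarity: dot(a, b) / (||a|| * ||b||).
// Assumes both vectors are non-zero and have the same length.
double CosineSimilarity(const std::vector<double>& a,
                        const std::vector<double>& b)
{
  assert(a.size() == b.size());
  double dot = 0.0, normA = 0.0, normB = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i)
  {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (std::sqrt(normA) * std::sqrt(normB));
}

// Kernel-induced distance sqrt(K(a, a) + K(b, b) - 2 K(a, b)), with
// K = cosine similarity. Hypothetical helper, not an mlpack function.
double InducedDistance(const std::vector<double>& a,
                       const std::vector<double>& b)
{
  const double squared = CosineSimilarity(a, a) + CosineSimilarity(b, b)
      - 2.0 * CosineSimilarity(a, b);
  // Clamp tiny negative values caused by floating-point rounding.
  return std::sqrt(std::max(squared, 0.0));
}
```

Orthogonal vectors come out at sqrt(2) and opposite vectors at 2, so if you need the conventional cosine distance 1 - cos, you would have to convert the returned distances yourself.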

If you want to do brute-force search, specify the search mode in the constructor:

NeighborSearch<NearestNeighborSort, IPMetric<CosineDistance>, arma::mat,
    tree::StandardCoverTree> nn(X_train, NAIVE_MODE);
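If you want something to sanity-check the mlpack results against, brute-force search is simple enough to write by hand. The following is a minimal sketch in plain C++ (no mlpack, no Armadillo); Point, CosineDist, and BruteForceKnn are hypothetical names chosen for this example, and it uses the conventional cosine distance 1 - cos rather than the kernel-induced metric.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

using Point = std::vector<double>;

// Conventional cosine distance: 1 - dot(a, b) / (||a|| * ||b||).
// Assumes both vectors are non-zero and have the same length.
double CosineDist(const Point& a, const Point& b)
{
  assert(a.size() == b.size());
  double dot = 0.0, normA = 0.0, normB = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i)
  {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1.0 - dot / (std::sqrt(normA) * std::sqrt(normB));
}

// Returns the indices of the k reference points closest to the query,
// ordered from nearest to farthest. Every reference point is compared
// against the query, so the cost is linear in the number of references.
std::vector<std::size_t> BruteForceKnn(const std::vector<Point>& refs,
                                       const Point& query,
                                       std::size_t k)
{
  k = std::min(k, refs.size());

  std::vector<double> dist(refs.size());
  for (std::size_t i = 0; i < refs.size(); ++i)
    dist[i] = CosineDist(refs[i], query);

  std::vector<std::size_t> order(refs.size());
  std::iota(order.begin(), order.end(), std::size_t{0});

  // Only the first k positions need to end up sorted.
  std::partial_sort(order.begin(),
                    order.begin() + static_cast<std::ptrdiff_t>(k),
                    order.end(),
                    [&dist](std::size_t l, std::size_t r)
                    { return dist[l] < dist[r]; });
  order.resize(k);
  return order;
}
```

For KNN classification you would then take a majority vote over the labels at the returned indices. Because sqrt(2 - 2 cos) and 1 - cos are both decreasing in the cosine, the neighbor ordering here should match what the IPMetric-based search returns, up to ties.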

I hope this is helpful; let me know if I can clarify anything.
