How to compute the similarity between lists of features?

Views: 1138 Published: 2020/5/4 10:23:03 python numpy machine-learning

Problem description

I have users and resources. Each resource is described by a set of features, and each user is related to a different set of resources. In my particular case, the resources are web pages, and the features are information about the location of the visit, the time of the visit, the number of visits, etc., which are tied to a specific user each time.

I want to get a similarity measure between my users with respect to those features, but I can't find a way to aggregate the resource features together. I've done it with text features, since it is possible to concatenate the documents and then extract features (say, TF-IDF), but I don't know how to proceed with this configuration.

To be as clear as possible, here is what I have:

>>> len(user_features)
13 # that's my number of users
>>> user_features[0].shape
(2374, 17) # 2374 documents for this user, and 17 features

I'm able to get a similarity matrix of the documents, using Euclidean distances for instance:

>>> euclidean_distance(user_features[0], user_features[0])
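The snippet above does not show where `euclidean_distance` comes from. As a minimal sketch, assuming it returns the pairwise distances between the rows of its two arguments, an equivalent computation in plain NumPy could look like this. The helper name `pairwise_euclidean` and the random `docs` array are hypothetical stand-ins, not part of the original setup.

```python
import numpy as np

def pairwise_euclidean(a, b):
    """Return the (len(a), len(b)) matrix of Euclidean distances between rows."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y, evaluated for all row pairs at once
    sq = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    # Clip tiny negative values caused by floating-point rounding
    return np.sqrt(np.maximum(sq, 0.0))

# Hypothetical stand-in for user_features[0]: 5 documents, 17 features
rng = np.random.default_rng(0)
docs = rng.normal(size=(5, 17))
doc_distances = pairwise_euclidean(docs, docs)
print(doc_distances.shape)  # (5, 5)
```

For the real data this gives a 2374 x 2374 matrix for the first user, which compares documents with each other but says nothing yet about how users relate.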

But I don't know how to compare the users against each other. I should somehow aggregate the features together to end up with an N_Users x N_Features matrix, but I don't know how.
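One possible way to obtain such a matrix, sketched here under assumptions that are not in the question: summarise each user's documents with per-feature statistics (mean and standard deviation), stack the summaries into one row per user, standardise the columns, and compute distances between rows. The function names `aggregate_users` and `user_distance_matrix` are hypothetical, the data is random, and summary statistics are only one of several reasonable aggregations.

```python
import numpy as np

def aggregate_users(user_features):
    """Summarise each (n_docs, n_features) array by per-feature mean and std."""
    rows = [np.concatenate([f.mean(axis=0), f.std(axis=0)]) for f in user_features]
    return np.vstack(rows)  # shape: (n_users, 2 * n_features)

def user_distance_matrix(user_matrix):
    """Euclidean distances between users after standardising each column."""
    scale = user_matrix.std(axis=0)
    scale[scale == 0] = 1.0  # avoid dividing by zero for constant columns
    z = (user_matrix - user_matrix.mean(axis=0)) / scale
    # Broadcast to all (user_i, user_j) pairs, then reduce over the feature axis
    diff = z[:, None, :] - z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Hypothetical data: 13 users, each with a different number of documents, 17 features
rng = np.random.default_rng(0)
user_features = [rng.normal(size=(int(n), 17)) for n in rng.integers(50, 200, size=13)]
user_matrix = aggregate_users(user_features)
distances = user_distance_matrix(user_matrix)
print(user_matrix.shape, distances.shape)  # (13, 34) (13, 13)
```

Standardising the columns keeps features with large numeric ranges (visit counts, say) from dominating the distance. Whether mean and standard deviation capture enough of each user's behaviour depends on the data.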

Any hints on how to proceed?

Some more information about the features I'm using:

The features I have here are not completely fixed. So far I have 13 different features, already aggregated from "views": the standard deviation, mean, etc. for each of the views, so that I have something "flat" that can be compared. One of the features is: has the location changed since the last view? What about one hour ago? Two hours ago?
