Hierarchical clustering on correlations in Python scipy/numpy?

Question

How can I run hierarchical clustering on a correlation matrix in scipy/numpy? I have a matrix of 100 rows by 9 columns, and I'd like to hierarchically cluster the rows by the correlation of each entry across the 9 conditions. I'd like to use 1 - Pearson correlation as the distance for clustering. Assuming I have a numpy array X that contains the 100 x 9 matrix, how can I do this?

I tried using hcluster, based on this example:

Y=pdist(X, 'seuclidean')
Z=linkage(Y, 'single')
dendrogram(Z, color_threshold=0)

However, this pdist call is not what I want, since 'seuclidean' is a (standardized) Euclidean distance rather than a correlation-based one. Any ideas?

Thanks.

Recommended answer

Just change the metric to 'correlation', so that the first line becomes:

Y=pdist(X, 'correlation')
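
As a sanity check that the 'correlation' metric matches the 1 - Pearson distance asked for, here is a minimal sketch using scipy.spatial.distance and NumPy. The random X is only a stand-in for the real 100 x 9 matrix, and the names D and expected are illustrative, not part of the original answer.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Assumption: random data standing in for the real 100 x 9 matrix X.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 9))

# Condensed vector of pairwise row distances (one entry per pair of rows).
Y = pdist(X, 'correlation')

# Expand to a square 100 x 100 matrix to compare against NumPy directly.
D = squareform(Y)

# np.corrcoef treats each row as a variable, so this is 1 - Pearson r
# between every pair of rows.
expected = 1.0 - np.corrcoef(X)

print(Y.shape)
print(np.allclose(D, expected))
```

If the two matrices agree, pdist is already computing the distance the question asks for, so no custom metric function should be needed.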

I believe the code can also be simplified to just:

Z=linkage(X, 'single', 'correlation')
dendrogram(Z, color_threshold=0)

because linkage, when given the raw observation matrix and a metric name, will take care of the pdist call for you.
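
Putting the pieces together, the following is a minimal end-to-end sketch using scipy.cluster.hierarchy rather than the older hcluster package. It assumes random data in place of the real X, and the fcluster step with 4 clusters is an illustrative addition that the original answer does not include.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster
from scipy.spatial.distance import pdist

# Assumption: random data standing in for the real 100 x 9 matrix X.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 9))

# One-step form: linkage computes the correlation distances itself.
Z_direct = linkage(X, method='single', metric='correlation')

# Two-step form: explicit pdist, then linkage on the condensed distances.
Z_two_step = linkage(pdist(X, 'correlation'), method='single')

# Both routes should produce the same linkage matrix.
same = np.allclose(Z_direct, Z_two_step)
print(same)

# no_plot=True returns the dendrogram layout without drawing anything;
# drop it to render the tree with matplotlib.
tree = dendrogram(Z_direct, color_threshold=0, no_plot=True)

# Illustrative extra step: cut the tree into at most 4 flat clusters.
labels = fcluster(Z_direct, t=4, criterion='maxclust')
print(labels.shape)
```

Single linkage is kept here only because the question used it; other methods such as 'average' or 'complete' are passed to linkage the same way and may give more balanced clusters on correlation distances.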
