Hierarchical clustering on correlations in Python scipy/numpy?
Question
How can I run hierarchical clustering on a correlation matrix in scipy/numpy? I have a matrix of 100 rows by 9 columns, and I'd like to hierarchically cluster the rows by the correlation of each entry across the 9 conditions. I'd like to use 1 - Pearson correlation as the distance for clustering. Assuming I have a numpy array X that contains the 100 x 9 matrix, how can I do this?
I tried using hcluster, based on this example:
Y=pdist(X, 'seuclidean')
Z=linkage(Y, 'single')
dendrogram(Z, color_threshold=0)
However, pdist with 'seuclidean' is not what I want, since that's a Euclidean distance. Any ideas?
Thanks.
Recommended answer
Just change the metric to correlation so that the first line becomes:
Y=pdist(X, 'correlation')
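As a quick sanity check that this metric matches the 1 - Pearson distance asked about, the sketch below compares it against np.corrcoef. It assumes the scipy.spatial.distance version of pdist, and the random X is only a stand-in for the real 100 x 9 data.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Synthetic stand-in for the 100 x 9 matrix X from the question (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 9))

# Condensed vector of pairwise distances between rows, using the
# 'correlation' metric (1 - Pearson correlation of each pair of rows).
Y = pdist(X, 'correlation')

# Reference computation: np.corrcoef treats each row as a variable,
# so it returns a 100 x 100 correlation matrix.
D_ref = 1.0 - np.corrcoef(X)

# squareform expands the condensed vector into a square symmetric matrix.
D = squareform(Y)
matches = np.allclose(D, D_ref)
print(Y.shape, matches)
```

The condensed form holds one entry per unordered pair of rows, which is the shape linkage expects when it is given precomputed distances.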
However, I believe that the code can be simplified to just:
Z=linkage(X, 'single', 'correlation')
dendrogram(Z, color_threshold=0)
because linkage will take care of the pdist call for you when it is given the raw observation matrix.
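For reference, here is a minimal self-contained sketch of the simplified version. It assumes the scipy.cluster.hierarchy versions of linkage and dendrogram rather than hcluster, uses random data in place of the real X, and adds an fcluster call with an arbitrary cluster count of 4 that is not part of the original answer.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster
from scipy.spatial.distance import pdist

# Synthetic stand-in for the 100 x 9 matrix X from the question (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 9))

# Passing the 2-D observation matrix lets linkage compute the
# correlation distances itself.
Z = linkage(X, method='single', metric='correlation')

# Two-step version from the first part of the answer, for comparison.
Z_two_step = linkage(pdist(X, 'correlation'), method='single')
same = np.allclose(Z, Z_two_step)

# no_plot=True returns the dendrogram layout as a dict without drawing it;
# drop that argument to draw the tree with matplotlib.
tree = dendrogram(Z, color_threshold=0, no_plot=True)
leaf_order = tree['leaves']

# Illustrative addition: cut the tree into at most 4 flat clusters.
labels = fcluster(Z, t=4, criterion='maxclust')
print(Z.shape, same, len(set(labels)))
```

Single linkage tends to produce long chained clusters; other methods such as 'average' or 'complete' can be passed in the same way if that turns out to be a problem for the data.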