Spectral clustering

Question

First off, I must say that I'm new to MATLAB (and to this site), so please excuse my ignorance.

I'm trying to write a function in MATLAB that uses spectral clustering to split a set of points into two clusters.

My code is as follows:

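function Groups = TrySpectralClustering(data)
% Pairwise Euclidean distances between the rows of data
dist_mat = squareform(pdist(data));

% Affinity matrix: W(i,j) = 10^(-distance), with a zero diagonal
W = zeros(length(data), length(data));
for i = 1:length(data)
    for j = (i+1):length(data)
        W(i,j) = 10^(-dist_mat(i,j));
        W(j,i) = W(i,j);
    end
end

% Degree matrix: D(i,i) is the sum of row i of W
D = zeros(length(data), length(data));
for i = 1:length(W)
    D(i,i) = sum(W(i,:));
end

% Normalized Laplacian D^(-1/2) * (D - W) * D^(-1/2)
L = D - W;
L = D^(-0.5) * L * D^(-0.5);

% Eigenvectors are the columns of V; eigenvalues are on the diagonal of E
[V, E] = eig(L);
disp('V:');
disp(V);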

If I understand correctly, the eigenvector corresponding to the second-smallest eigenvalue should let me partition the data into two clusters: if the ith entry of that eigenvector is positive, the ith data point belongs to one cluster; otherwise it belongs to the other.

However, when I try the following:

f=[1,1;0,0;1,0;0,1;100,100;100,101;101,101;101,100]
TrySpectralClustering(f)

I would expect the first four points to form one cluster and the last four to form another.

However, I get:

V:
   -0.0000   -0.5000    0.0000   -0.5777    0.0000    0.4078   -0.0000    0.5000
   -0.0000   -0.5000    0.0000    0.5777    0.0000   -0.4078   -0.0000    0.5000
   -0.0000   -0.5000    0.0000    0.4078    0.0000    0.5777   -0.0000   -0.5000
   -0.0000   -0.5000    0.0000   -0.4078    0.0000   -0.5777   -0.0000   -0.5000
   -0.5000   -0.0000   -0.0000   -0.0000   -0.7071   -0.0000    0.5000   -0.0000
   -0.5000   -0.0000    0.7071    0.0000   -0.0000   -0.0000   -0.5000   -0.0000
   -0.5000    0.0000   -0.0000    0.0000    0.7071    0.0000    0.5000    0.0000
   -0.5000         0   -0.7071         0         0         0   -0.5000         0

Taking the second eigenvector:

  -0.0000   -0.5000    0.0000    0.5777    0.0000   -0.4078   -0.0000    0.5000

I find that one cluster contains the points (1,0), (0,1), (100,100), (101,100) and the other contains the points (1,1), (0,0), (100,101), (101,101).

I wonder what I am doing wrong.

Note: I am working on the above as part of a homework project.

Thanks in advance!

Answer

What you are getting is correct. Let U be the matrix of eigenvectors shown above, with its columns ordered so that the first column corresponds to the smallest eigenvalue and subsequent columns correspond to the eigenvalues in ascending order. Take a subset of the columns of U by keeping the eigenvectors that correspond to the smallest eigenvalues. Read these columns row-wise into a new set of vectors, Y, so that row i of the retained columns is the vector for data point i. Cluster Y to get the spectral clusters. Suppose the subset is just the first column: clustering its entries puts the first four points in one cluster and the last four in another, which is what you want.
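
The procedure described above could be sketched roughly as follows. This is only an illustration under some assumptions, not the asker's function: the names X, Y, and labels are made up for the example, the distances are computed by hand instead of with pdist, two columns are retained, and a simple two-seed nearest-neighbour assignment stands in for a real clustering step such as k-means. Note that eig returns the eigenvectors as the columns of V, so the second eigenvector is V(:,2) rather than the second printed row.

```matlab
% Illustrative sketch; X, Y, and labels are hypothetical names.
X = [1,1; 0,0; 1,0; 0,1; 100,100; 100,101; 101,101; 101,100];
n = size(X, 1);

% Pairwise Euclidean distances, computed without toolbox functions.
sq = sum(X.^2, 2);
dist_mat = sqrt(max(bsxfun(@plus, sq, sq') - 2*(X*X'), 0));

% Same affinity as in the question: 10^(-distance), zero on the diagonal.
W = 10.^(-dist_mat);
W(1:n+1:end) = 0;

% Normalized Laplacian D^(-1/2) * (D - W) * D^(-1/2).
d = sum(W, 2);
Dinv = diag(1 ./ sqrt(d));
L = Dinv * (diag(d) - W) * Dinv;
L = (L + L') / 2;              % remove round-off asymmetry before eig

% Sort eigenpairs so column 1 of V belongs to the smallest eigenvalue.
[V, E] = eig(L);
[evals, order] = sort(diag(E));
V = V(:, order);

% Keep the first two columns; row i of Y represents data point i.
Y = V(:, 1:2);

% Stand-in for k-means with k = 2 (an assumption of this sketch):
% seed with row 1 and the row farthest from it, then assign each row
% to the nearer of the two seeds.
d1 = sum(bsxfun(@minus, Y, Y(1,:)).^2, 2);
[~, far] = max(d1);
d2 = sum(bsxfun(@minus, Y, Y(far,:)).^2, 2);
labels = 1 + (d2 < d1);
disp(labels');
```

For the eight points in the question, the first four rows of Y coincide (up to round-off) and so do the last four, so the first four points end up with one label and the last four with the other. With less cleanly separated data, a proper clustering routine on the rows of Y would be the safer choice.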
