K-means with Mahalanobis distance - Singularity


Question

Hello,

I'm doing K-means clustering and am about to implement the Mahalanobis distance. My problem is that the covariance matrix is sometimes singular. I'm not sure what that means in this case or what to do about it. I'm fairly sure my code is OK, but here is the code for calculating the covariance matrix:

```csharp
public static Matrix CovarianceMatrix(List<double[]> dataset)
{
    /*
        cov_xx cov_xy ...
        cov_yx cov_yy ...
        ...
     */

    //Calculate mean for this cluster
    // cov_xx = sum[x*x]/n, cov_xy = sum[x*y]/n
    double[] means = new double[dataset[0].Length];
    Matrix cov = new Matrix(dataset[0].Length, dataset[0].Length);
    double sum = 0;

    for (int i = 0; i < dataset[0].Length; i++)
    {
        for (int j = 0; j < dataset.Count; j++)
        {
            means[i] += dataset[j][i];
        }
        means[i] /= dataset.Count;
    }

    double[,] subresults = new double[dataset[0].Length, dataset.Count];
    for (int j = 0; j < dataset.Count; j++)
    {
        for (int i = 0; i < dataset[0].Length; i++)
        {
            subresults[i, j] = dataset[j][i] - means[i];
        }
    }

    //fill covariance
    for (int i = 0; i < dataset[0].Length; i++)
    {
        for (int j = i; j < dataset[0].Length; j++)
        {
            double s = 0;
            for (int x = 0; x < dataset.Count; x++)
            {
                s += subresults[i, x] * subresults[j, x];
            }
            cov.SetElement(i, j, s / dataset.Count);
            if (i != j) cov.SetElement(j, i, s / dataset.Count);
        }
    }
    return cov;
}
```



And here is the code for the distance:

```csharp
public static double Mahalanobis(double[] vector1, double[] vector2, Matrix covariance)
{
    Matrix v1 = new Matrix(vector1, vector1.Length);
    Matrix v2 = new Matrix(vector2, vector2.Length);
    Matrix m = v1.Subtract(v2);

    // Note: this evaluates m^T * covariance^-1 * m with no square root,
    // so the value returned is the squared Mahalanobis distance.
    return (double)(m.Transpose()).Multiply(covariance.Inverse()).Multiply(m).GetElement(0, 0);
}
```


If more information (or comments) or a working code sample is preferred, let me know. However, sometimes it clusters without a problem, so I think this is more about how to handle the singularity than about the code itself.

Looking forward to hearing from you.

modified on Friday, May 22, 2009 5:32 AM

Recommended answer

A singular matrix has a determinant of zero, which means you can't invert it.

That's probably happening when you invert your covariance matrix: covariance.Inverse()

A covariance matrix is always positive semi-definite, but it is only invertible when it is positive definite. A singular one means the data points used to build it don't span every dimension: some components are probably linear combinations of one another, or the cluster has too few points (no more points than dimensions). Because cluster membership changes between runs and iterations, this would also explain why it only fails sometimes.

If these are random vectors, it could be that a component of the vector is extraneous. Better check your vectors.
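One common way to cope with a singular covariance matrix, offered here as a rough sketch rather than as the fix for the code in the question, is to add a small value to the diagonal before using the matrix. The example below works on plain `double[,]` arrays instead of the question's `Matrix` class, whose API isn't shown in full. The class name `MahalanobisSketch`, its method names, the `1e-12` tolerance and the choice of epsilon are all assumptions made for illustration. It uses a Cholesky factorization both to detect a matrix that is not positive definite and to compute the squared distance without forming an explicit inverse.

```csharp
using System;

public static class MahalanobisSketch
{
    // Returns a copy of cov with epsilon added to every diagonal entry
    // (a "ridge" term). epsilon is a tuning value, not a derived constant.
    public static double[,] Regularize(double[,] cov, double epsilon)
    {
        int n = cov.GetLength(0);
        double[,] result = (double[,])cov.Clone();
        for (int i = 0; i < n; i++) result[i, i] += epsilon;
        return result;
    }

    // Tries to factor a symmetric matrix as A = L * L^T (Cholesky).
    // Returns false when A is not numerically positive definite,
    // which includes the singular case. The 1e-12 cutoff is an assumption.
    public static bool TryCholesky(double[,] a, out double[,] lower)
    {
        int n = a.GetLength(0);
        lower = new double[n, n];
        for (int i = 0; i < n; i++)
        {
            for (int j = 0; j <= i; j++)
            {
                double sum = a[i, j];
                for (int k = 0; k < j; k++) sum -= lower[i, k] * lower[j, k];
                if (i == j)
                {
                    if (sum <= 1e-12) return false;
                    lower[i, i] = Math.Sqrt(sum);
                }
                else
                {
                    lower[i, j] = sum / lower[j, j];
                }
            }
        }
        return true;
    }

    // Squared Mahalanobis distance (x - y)^T A^-1 (x - y), given the
    // Cholesky factor L of A: solve L z = (x - y), then return z . z.
    public static double SquaredDistance(double[] x, double[] y, double[,] lower)
    {
        int n = x.Length;
        double[] z = new double[n];
        double total = 0;
        for (int i = 0; i < n; i++)
        {
            double sum = x[i] - y[i];
            for (int k = 0; k < i; k++) sum -= lower[i, k] * z[k];
            z[i] = sum / lower[i, i];
            total += z[i] * z[i];
        }
        return total;
    }
}
```

A caller would try `TryCholesky` on the cluster's covariance first and, if it returns false, retry on `Regularize(cov, epsilon)` with a small epsilon chosen relative to the scale of the data. Other options worth considering are a pseudo-inverse, a single covariance matrix pooled across all clusters, or making sure every cluster keeps more points than there are dimensions.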

