Mahalanobis distance in R

Question

I found the mahalanobis.dist function in the StatMatch package (http://cran.r-project.org/web/packages/StatMatch/StatMatch.pdf), but it doesn't do exactly what I want. It seems to calculate the Mahalanobis distance from each observation in data.y to each observation in data.x.

I would like to calculate the Mahalanobis distance from one observation in data.y to all of the observations in data.x taken together. In other words, I want the Mahalanobis distance from one point to a "cloud" of points, if that makes sense. The idea is to get at the probability that an observation belongs to another group of observations.

This person (http://people.revoledu.com/kardi/tutorial/Similarity/MahalanobisDistance.html) seems to be doing this, and I've tried to replicate his process in R, but it fails when I get to the last part of the equation:

mahaldist = sqrt((inversepooledcov %*% t(meandiffmatrix)) %*% meandiffmatrix)

All the code I am working with is here:

a = rbind(c(2,2), c(2,5), c(6,5),c(7,3))

colnames(a) = c('x', 'y')

b = rbind(c(6,5),c(3,4))

colnames(b) = c('x', 'y')

acov = cov(a)
bcov = cov(b)

meandiff1 = mean(a[,1]) - mean(b[,1])

meandiff2 = mean(a[,2]) - mean(b[,2])

meandiffmatrix = rbind(c(meandiff1,meandiff2))

totaldata = dim(a)[1] + dim(b)[1]

pooledcov = (dim(a)[1]/totaldata * acov) + (dim(b)[1]/totaldata * bcov)

inversepooledcov = solve(pooledcov)

mahaldist = sqrt((inversepooledcov %*% t(meandiffmatrix)) %*% meandiffmatrix)


Recommended answer

I've been trying this out from the same website that you looked at and then stumbled upon this question. I managed to get the script to work, but I get a different result.

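# WORKING EXAMPLE
# MAHALANOBIS DISTANCE BETWEEN TWO MATRICES (GROUPS)

# define the two groups: 10 x 2 and 5 x 2, filled column by column
mat1 <- matrix(data = c(2,2,6,7,4,6,5,4,2,1,2,5,5,3,7,4,3,6,5,3), nrow = 10)
mat2 <- matrix(data = c(6,7,8,5,5,5,4,7,6,4), nrow = 5)
# center each group (cov() centers internally, so this step is optional)
mat1.1 <- scale(mat1, center = TRUE, scale = FALSE)
mat2.1 <- scale(mat2, center = TRUE, scale = FALSE)
# covariance matrix of each group
mat1.2 <- cov(mat1.1, method = "pearson")
mat2.2 <- cov(mat2.1, method = "pearson")
n1 <- nrow(mat1)
n2 <- nrow(mat2)
n3 <- n1 + n2
# pooled covariance matrix, weighted by group size
mat3 <- ((n1 / n3) * mat1.2) + ((n2 / n3) * mat2.2)
# inverse of the pooled covariance matrix
mat4 <- solve(mat3)
# difference of the group means as a 2 x 1 column vector
mat5 <- as.matrix(colMeans(mat1) - colMeans(mat2))
# (1 x 2) %*% (2 x 2) gives a 1 x 2 row vector
mat6 <- t(mat5) %*% mat4
# (1 x 2) %*% (2 x 1) gives a 1 x 1 result; its square root is the distance
sqrt(mat6 %*% mat5)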
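
For what it's worth, the last line of the question's code seems to go wrong because of the order of the matrix product: inversepooledcov %*% t(meandiffmatrix) is 2 x 1, and multiplying that by the 1 x 2 meandiffmatrix gives a 2 x 2 matrix rather than a single number. Below is a minimal sketch of the same calculation on the question's data with the row vector first. It keeps the question's variable names where they exist; n_a and n_b are names I introduced for illustration.

```r
# Data from the question
a <- rbind(c(2, 2), c(2, 5), c(6, 5), c(7, 3))
b <- rbind(c(6, 5), c(3, 4))
colnames(a) <- c("x", "y")
colnames(b) <- c("x", "y")

# Pooled covariance, weighted by group size (same weighting as the question)
n_a <- nrow(a)
n_b <- nrow(b)
pooledcov <- (n_a / (n_a + n_b)) * cov(a) + (n_b / (n_a + n_b)) * cov(b)
inversepooledcov <- solve(pooledcov)

# 1 x 2 row vector holding the difference of the group means
meandiffmatrix <- rbind(colMeans(a) - colMeans(b))

# (1 x 2) %*% (2 x 2) %*% (2 x 1) collapses to a 1 x 1 matrix
mahaldist <- sqrt(meandiffmatrix %*% inversepooledcov %*% t(meandiffmatrix))
mahaldist
```

The only substantive change from the question is the order of the three factors in the final product, which is the same order used in the working example above.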

I think the base mahalanobis() function computes the squared Mahalanobis distance of each individual (row) of one matrix from a given center, using a given covariance matrix, rather than the distance between two matrices. The pairwise.mahalanobis() function from the HDMD package can compare two or more groups and give the Mahalanobis distances between them.
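
If the goal is the distance from a single point to a cloud of points, as the question describes, base R's mahalanobis() may already be enough. Here is a rough sketch under some assumptions: the cloud is the question's matrix a, the point is the first row of b, and the names cloud, point, d2, d and p are mine. The last step reads the squared distance against a chi-squared distribution, which is only an approximation that assumes the cloud is roughly multivariate normal; with four points it is very crude.

```r
# Cloud of points (the question's `a`) and one new observation (first row of `b`)
cloud <- rbind(c(2, 2), c(2, 5), c(6, 5), c(7, 3))
colnames(cloud) <- c("x", "y")
point <- c(x = 6, y = 5)

# stats::mahalanobis() returns the SQUARED distance of `point` from `center`
d2 <- mahalanobis(point, center = colMeans(cloud), cov = cov(cloud))
d <- sqrt(d2)

# Assumption: approximate multivariate normality of the cloud.
# Upper-tail chi-squared probability with df = number of variables.
p <- pchisq(d2, df = ncol(cloud), lower.tail = FALSE)

d
p
```

A small p would suggest the point sits far from the cloud relative to its spread; passing a matrix instead of a single vector as the first argument gives one squared distance per row.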
