Distance between vectors with missing values

Question

For vectors A and B, the Euclidean distance is: sqrt((A1-B1)^2 + (A2-B2)^2 + ... + (An-Bn)^2)

A <- c(5, 4, 3, 2, 1, 1, 2, 3, 5)
B <- c(1, 0, 6, 4, 3, 2, 3, 1, 3)
dist(rbind(A, B), method = "euclidean")
# 7.681146
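
As a quick check on the formula, the same number can be reproduced by hand in base R. This is only a minimal sketch; the variable name `manual` is illustrative and not part of the original question.

```r
A <- c(5, 4, 3, 2, 1, 1, 2, 3, 5)
B <- c(1, 0, 6, 4, 3, 2, 3, 1, 3)

# Square root of the sum of squared coordinate differences
manual <- sqrt(sum((A - B)^2))

# Compare with dist(); as.numeric() drops the "dist" class and labels
all.equal(manual, as.numeric(dist(rbind(A, B), method = "euclidean")))
```

The squared differences sum to 59, so both values equal sqrt(59).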

How is the distance calculated when vectors A and B contain missing values? In the example below, R reports a distance of 8.485281, but how is that value obtained?

A <- c(5, NA, NA, NA, 1, 1, 2, 3, 5)
B <- c(1, 0, 6, NA, NA, NA, NA, 1, 3)
dist(rbind(A,B), method= "euclidean")
8.485281

Recommended answer

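Entries where either vector is NA are dropped first. The distance over the remaining entries is then scaled up to account for the larger dimension of the full sample: the sum of squared differences is multiplied by n/m, where n is the full length and m is the number of complete pairs, so the distance itself is multiplied by sqrt(n/m). In the example n = 9 and m = 3, which gives sqrt(24) * sqrt(3) = sqrt(72) = 8.485281: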

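# Flag positions where either vector has a missing value
i <- is.na(A) | is.na(B)
# Distance over the complete pairs, rescaled to the full length
dist(rbind(A[!i], B[!i])) * sqrt(length(A) / length(A[!i]))
#          1
# 2 8.485281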
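
The rescaling can also be written out explicitly and compared with what `dist()` returns on the vectors that still contain NA. The sketch below is illustrative only: `na_euclid` is a hypothetical helper name, not a base R function, and it assumes both vectors have the same length and share at least one complete pair.

```r
A <- c(5, NA, NA, NA, 1, 1, 2, 3, 5)
B <- c(1, 0, 6, NA, NA, NA, NA, 1, 3)

# Hypothetical helper: Euclidean distance over complete pairs,
# with the sum of squares rescaled to the full vector length
na_euclid <- function(x, y) {
  ok <- !(is.na(x) | is.na(y))   # positions where both values are present
  n <- length(x)                 # full dimension (9 in this example)
  m <- sum(ok)                   # number of complete pairs (3 in this example)
  sqrt(sum((x[ok] - y[ok])^2) * n / m)
}

na_euclid(A, B)   # sqrt(24 * 9 / 3) = sqrt(72)

# Should agree with dist() applied directly to the vectors with NA
all.equal(na_euclid(A, B), as.numeric(dist(rbind(A, B))))
```

Only positions 1, 8 and 9 are complete here, with differences 4, 2 and 2, so the unscaled sum of squares is 24.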
