Distance matrix from two separate data frames

Question

I'd like to create a matrix containing the Euclidean distances between the rows of one data frame and the rows of another. For example, say I have the following data frames:

a <- c(1,2,3,4,5)
b <- c(5,4,3,2,1)
c <- c(5,4,1,2,3)
df1 <- data.frame(a,b,c)

a2 <- c(2,7,1,2,3)
b2 <- c(7,6,5,4,3)
c2 <- c(1,2,3,4,5)
df2 <- data.frame(a2,b2,c2)

I would like to create a matrix holding the distance from each row of df1 to each row of df2.

So matrix[2,1] should be the Euclidean distance between df1[2,] and df2[1,], matrix[3,2] the distance between df1[3,] and df2[2,], and so on.

Does anyone know how to do this?

Recommended answer

Perhaps you could use the fields package: its rdist function might do what you want:

rdist : Euclidean distance matrix
Description: Given two sets of locations computes the Euclidean distance matrix among all pairings.

> library(fields)
> rdist(df1, df2)
         [,1]     [,2]     [,3]     [,4]     [,5]
[1,] 4.582576 6.782330 2.000000 1.732051 2.828427
[2,] 4.242641 5.744563 1.732051 0.000000 1.732051
[3,] 4.123106 5.099020 3.464102 3.316625 4.000000
[4,] 5.477226 5.000000 4.358899 3.464102 3.316625
[5,] 7.000000 5.477226 5.656854 4.358899 3.464102
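
For reference, the same cross-distance matrix can be computed in base R without any package. The following is a minimal sketch under that assumption, not the fields implementation; the helper name cross_dist is hypothetical and introduced only for illustration.

```r
# Example data from the question
df1 <- data.frame(a = c(1, 2, 3, 4, 5),
                  b = c(5, 4, 3, 2, 1),
                  c = c(5, 4, 1, 2, 3))
df2 <- data.frame(a2 = c(2, 7, 1, 2, 3),
                  b2 = c(7, 6, 5, 4, 3),
                  c2 = c(1, 2, 3, 4, 5))

# cross_dist is a hypothetical helper (not part of fields or pdist).
cross_dist <- function(x, y) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  stopifnot(ncol(x) == ncol(y))
  out <- matrix(NA_real_, nrow = nrow(x), ncol = nrow(y))
  for (i in seq_len(nrow(x))) {
    # Subtract row i of x from every row of y, then take row-wise norms
    out[i, ] <- sqrt(rowSums(sweep(y, 2, x[i, ])^2))
  }
  out
}

d <- cross_dist(df1, df2)
d[2, 1]  # distance between df1[2, ] and df2[1, ]
```

The explicit loop keeps the definition visible: entry [i, j] is the norm of the difference between row i of the first input and row j of the second. For large inputs a packaged routine such as rdist is likely the better choice.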

The pdist package handles a similar case:

pdist : Distances between Observations for a Partitioned Matrix
Description: Computes the euclidean distance between rows of a matrix X and rows of another matrix Y.

> library(pdist)
> pdist(df1, df2)
An object of class "pdist"
Slot "dist":
[1] 4.582576 6.782330 2.000000 1.732051 2.828427 4.242640 5.744563 1.732051
[9] 0.000000 1.732051 4.123106 5.099020 3.464102 3.316625 4.000000 5.477226
[17] 5.000000 4.358899 3.464102 3.316625 7.000000 5.477226 5.656854 4.358899
[25] 3.464102
attr(,"Csingle")
[1] TRUE

Slot "n":
[1] 5

Slot "p":
[1] 5

Slot ".S3Class":
[1] "pdist"
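
The dist slot holds the distances as one flat vector rather than a matrix. Comparing its first five values with the first row of the rdist output suggests it is laid out row by row, so it can be reshaped into the matrix the question asks for. A minimal base-R sketch, with the values copied from the printed output and illustrative variable names; the pdist documentation also appears to describe an as.matrix() method for these objects, which would be the more direct route, but treat that as an assumption to verify.

```r
# Values copied from the "dist" slot printed above
dist_vec <- c(4.582576, 6.782330, 2.000000, 1.732051, 2.828427,
              4.242640, 5.744563, 1.732051, 0.000000, 1.732051,
              4.123106, 5.099020, 3.464102, 3.316625, 4.000000,
              5.477226, 5.000000, 4.358899, 3.464102, 3.316625,
              7.000000, 5.477226, 5.656854, 4.358899, 3.464102)

# Slots n and p are both 5 here; fill the matrix row by row
m <- matrix(dist_vec, nrow = 5, ncol = 5, byrow = TRUE)
m[2, 1]  # distance between df1[2, ] and df2[1, ]
```

The small difference from the rdist value at that position (4.242640 versus 4.242641) is consistent with the Csingle attribute shown in the output, which indicates single-precision storage.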

NOTE: If you instead want the Euclidean distances between the original vectors (a, b, c against a2, b2, c2) rather than between data-frame rows, bind the vectors as rows with rbind:

a <- c(1,2,3,4,5)
b <- c(5,4,3,2,1)
c <- c(5,4,1,2,3)
df1 <- rbind(a, b, c)

a2 <- c(2,7,1,2,3)
b2 <- c(7,6,5,4,3)
c2 <- c(1,2,3,4,5)
df2 <- rbind(a2,b2,c2)

rdist(df1, df2)

This gives:

> rdist(df1, df2)
         [,1]     [,2]     [,3]
[1,] 6.164414 7.745967 0.000000
[2,] 5.099020 4.472136 6.324555
[3,] 4.242641 5.291503 5.656854
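
The rbind matrices in this note hold the same numbers as the transposed data frames from the first example, so each vector becomes a row. A small base-R check of that relationship, as a sketch; the names df1_cols and df1_rows are made up for illustration.

```r
a <- c(1, 2, 3, 4, 5)
b <- c(5, 4, 3, 2, 1)
c <- c(5, 4, 1, 2, 3)

df1_cols <- data.frame(a, b, c)  # vectors as columns (first example)
df1_rows <- rbind(a, b, c)       # vectors as rows (this note)

# Transposing the data frame gives the same numbers as rbind
same <- isTRUE(all.equal(t(as.matrix(df1_cols)), df1_rows,
                         check.attributes = FALSE))
same

# Distance between vectors a and b, i.e. rows "a" and "b" of df1_rows
sqrt(sum((df1_rows["a", ] - df1_rows["b", ])^2))
```

Given that, passing t(df1) and t(df2) from the first example to rdist should give the same result as the rbind version.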
