Matrix of distances with geosphere: avoiding repeated calculations
Question
I want to compute the distances between all points in a very large matrix using distm from geosphere.
Here is a minimal example:
library(geosphere)
library(data.table)
coords <- data.table(coordX=c(1,2,5,9), coordY=c(2,2,0,1))
distances <- distm(coords, coords, fun = distGeo)
The issue is that, because of the nature of the distances I am computing, distm returns a symmetric matrix, so I could avoid calculating more than half of the distances:
structure(c(0, 111252.129800202, 497091.059564718, 897081.91986428,
111252.129800202, 0, 400487.621661164, 786770.053508848, 497091.059564718,
400487.621661164, 0, 458780.072878927, 897081.91986428, 786770.053508848,
458780.072878927, 0), .Dim = c(4L, 4L))
Could you help me find a more efficient way to compute all of these distances without calculating each one twice?
Recommended answer
If you want to compute all pairwise distances for a set of points x, it is better to use distm(x) rather than distm(x, x). The distm function returns the same symmetric matrix in both cases, but when you pass it a single argument it knows that the matrix is symmetric, so it does not do unnecessary computations.
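As a quick check of the claim above, the following sketch compares the two calls on the coordinates from the question. It uses a plain matrix instead of a data.table, and the variable names d_one, d_two and d_compact are only illustrative. The as.dist step is an optional extra, not part of the original answer: it keeps only the lower triangle, which may help with memory when the full square matrix is not needed.

```r
library(geosphere)

# Coordinates from the question, as a two-column (longitude, latitude) matrix
coords <- cbind(coordX = c(1, 2, 5, 9), coordY = c(2, 2, 0, 1))

d_one <- distm(coords)          # single-argument form
d_two <- distm(coords, coords)  # two-argument form

# Should be TRUE if both forms give the same distances (up to numerical tolerance)
all.equal(d_one, d_two)

# Optional: store only the lower triangle as a "dist" object
d_compact <- as.dist(d_one)
```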
You can time the difference yourself:
library("geosphere")
n <- 500
xy <- matrix(runif(n*2, -90, 90), n, 2)
system.time( replicate(100, distm(xy, xy) ) )
# user system elapsed
# 61.44 0.23 62.79
system.time( replicate(100, distm(xy) ) )
# user system elapsed
# 36.27 0.39 38.05
You can also look at the R code for geosphere::distm to check that it treats the two cases differently.
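To illustrate the idea of computing each pair only once, here is a simplified sketch. It is not the package's actual source; pairwise_lower is a hypothetical helper written for this example. It fills only the lower triangle with distGeo and then mirrors it.

```r
library(geosphere)

# Hypothetical helper, for illustration only: computes each pair once.
pairwise_lower <- function(xy, fun = distGeo) {
  n <- nrow(xy)
  dm <- matrix(0, n, n)
  if (n < 2) return(dm)
  for (i in 2:n) {
    j <- seq_len(i - 1)
    # Distances from point i to every earlier point (lower triangle only)
    dm[i, j] <- fun(xy[rep(i, length(j)), , drop = FALSE],
                    xy[j, , drop = FALSE])
  }
  # Mirror the lower triangle; the diagonal stays at zero
  dm + t(dm)
}

xy <- cbind(c(1, 2, 5, 9), c(2, 2, 0, 1))
all.equal(pairwise_lower(xy), distm(xy))
```

The loop evaluates n(n-1)/2 distances rather than n^2, which is consistent with the roughly halved timing shown above.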
Aside: a quick Google search finds parallelDist (Parallel Distance Matrix Computation) on CRAN. The geodesic distance is one of its options.