How can I calculate distance between multiple latitude and longitude data?

Question

I have location data (latitude and longitude) for 1100 stations and 10000 houses. Is it possible to use R code to calculate, for each house, the smallest distance between that house and any station? I also want to know which station gives that smallest distance for each house. Is that possible?

Recommended answer

Here's a toy example for finding mass distances between m points and n cities. It should translate directly to your station/house problem.

I brought up worldcities, spun the globe (so to speak), and stopped on four cities. I then spun again and stopped at two points. The two counts here are immaterial: whether we have 4 and 2 or 1100 and 10000, the approach is the same.

worldcities <- read.csv(header = TRUE, stringsAsFactors = FALSE, text = "
lat,lon
39.7642548,-104.9951942
48.8588377,2.2770206
26.9840891,49.4080842
13.7245601,100.493026")

coords <- read.csv(header = TRUE, stringsAsFactors = FALSE, text = "
lat,lon
27.9519571,66.8681431
40.5351151,-108.4939948")

(A quick note: tools often give us coordinates as "latitude, longitude", at least in my experience. geosphere functions, however, assume "longitude, latitude". My coordinates above were copied straight from random views in Google Maps and I didn't want to edit them, so I reverse the columns below with [,2:1] column indexing. If you forget and pass coordinates that cannot possibly be valid, you'll get the error Error in .pointsToMatrix(p1) : latitude < -90, which should be a prod that you have likely reversed the order of your coordinates. At which point you scratch your head and wonder if all of your other projects have used the wrong coordinates, calling into question your conclusions. Not me, I've never been there. This year.)
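As a rough guard against the column-order mix-up described in that note, a range check along these lines could be run before computing any distances. check_lonlat is a hypothetical helper written for this illustration, not a geosphere function, and it assumes longitude is in the first column and latitude in the second.

```r
# Hypothetical helper (not part of geosphere): TRUE when the first column
# could be longitudes and the second column could be latitudes.
check_lonlat <- function(p) {
  lon <- p[[1]]
  lat <- p[[2]]
  all(abs(lon) <= 180) && all(abs(lat) <= 90)
}

# The two points from above, stored as lat, lon.
coords <- data.frame(lat = c(27.9519571, 40.5351151),
                     lon = c(66.8681431, -108.4939948))

check_lonlat(coords)         # columns passed in lat, lon order
check_lonlat(coords[, 2:1])  # columns reversed into lon, lat order
```

This only catches swaps that push a value outside the valid latitude range. A point such as the first one, where both numbers are within ±90, passes in either order, so the check is a prompt to look closer rather than a proof of correctness.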

Let's find the distance in meters between each of coords (each row) and each city (each column):

dists <- outer(seq_len(nrow(coords)), seq_len(nrow(worldcities)),
               function(i, j) geosphere::distHaversine(coords[i,2:1], worldcities[j,2:1]))
dists
#            [,1]    [,2]     [,3]     [,4]
# [1,] 12452329.0 5895577  1726433  3822220
# [2,]   309802.8 7994185 12181477 13296825

It should be straightforward to find which city is closest to each coordinate:

apply(dists, 1, which.min)
# [1] 3 1

That is, the first point is closest to the third city, and the second point is closest to the first city.
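The question also asks for the smallest distance itself, not only which station produces it. One way to sketch that is to pair each row with its minimum via matrix indexing. The matrix below is typed in from the printed output above so the snippet runs on its own; the column names point, nearest_city and dist_m are made up for this illustration.

```r
# Distance matrix copied from the printed output above
# (rows = points, columns = cities, values in meters).
dists <- matrix(c(12452329.0, 5895577,  1726433,  3822220,
                    309802.8, 7994185, 12181477, 13296825),
                nrow = 2, byrow = TRUE)

# Column index of the closest city for each point.
nearest <- apply(dists, 1, which.min)

# Pick the matching distance with (row, column) index pairs.
result <- data.frame(
  point        = seq_len(nrow(dists)),
  nearest_city = nearest,
  dist_m       = dists[cbind(seq_len(nrow(dists)), nearest)]
)
result
```

For the station/house problem the same pattern would give one row per house, with the station index and the distance to it side by side.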

Just to show this is a tenable solution for a large number of pairs, here's the same problem scaled up a bit.

worldcities_big <- do.call(rbind, replicate(250, worldcities, simplify = FALSE))
nrow(worldcities_big)
# [1] 1000
coords_big <- do.call(rbind, replicate(5000, coords, simplify = FALSE))
nrow(coords_big)
# [1] 10000
system.time(
  dists <- outer(seq_len(nrow(coords_big)), seq_len(nrow(worldcities_big)),
                 function(i, j) geosphere::distHaversine(coords_big[i,2:1], worldcities_big[j,2:1]))
)
#    user  system elapsed 
#   67.62    2.22   70.03 

So it was not instantaneous, but 70 seconds is not horrible for 10,000,000 distance calculations. Could you make it faster? Perhaps, though I'm not sure precisely how to do so easily. I'd think some heuristics might reduce it from O(m*n) to O(m*log(n)) time, but I don't know if that's worth the coding complexity they would introduce.
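If only the nearest city per point is needed, the full m-by-n matrix does not have to be kept. The following is an untimed sketch in base R, still O(m*n), that computes one point's distances to all cities at a time and keeps only the minimum. haversine_m and nearest_target are hypothetical names written for this illustration, and the radius 6378137 is an assumption chosen to match what I understand to be geosphere's default, so results may differ slightly from distHaversine.

```r
# Haversine distance in meters from one point to a vector of points.
# r = 6378137 is an assumed Earth radius (see the note above).
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6378137) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# For each row of `points`, find the closest row of `targets`.
# Both arguments are data frames with columns named lat and lon.
nearest_target <- function(points, targets) {
  idx <- integer(nrow(points))
  dst <- numeric(nrow(points))
  for (i in seq_len(nrow(points))) {
    d <- haversine_m(points$lon[i], points$lat[i], targets$lon, targets$lat)
    idx[i] <- which.min(d)
    dst[i] <- d[idx[i]]
  }
  data.frame(nearest = idx, dist_m = dst)
}

worldcities <- data.frame(
  lat = c(39.7642548, 48.8588377, 26.9840891, 13.7245601),
  lon = c(-104.9951942, 2.2770206, 49.4080842, 100.493026))
coords <- data.frame(lat = c(27.9519571, 40.5351151),
                     lon = c(66.8681431, -108.4939948))

nearest_target(coords, worldcities)
```

Because the columns are referenced by name here, the lat/lon ordering issue does not arise. Whether this is actually faster than the outer() approach on 10,000 by 1,000 inputs would need to be measured.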
