Using the geosphere distm function on a data.table to calculate distances

Question

I've created a data.table that has 6 columns. Each row compares two locations: Location 1 and Location 2. I'm trying to use the distm function to calculate the distance between the two locations on each row, creating a 7th column. The distm function in the geosphere package requires two separate inputs, one for each lat/long pair to be compared. My code below does not work, so I'm trying to figure out how to provide vectors to the function.

LOC_1_ID LOC1_LAT_CORD LOC1_LONG_CORD LOC_2_ID LOC2_LAT_CORD LOC2_LONG_CORD
 1       35.68440        -80.48090        70624    34.86752   -82.46632
 6       35.49770        -80.62870        70624    34.86752   -82.46632
10       35.66042        -80.50053        70624    34.86752   -82.46632

Assuming res holds the data.table, the code below does not work.

 res[,DISTANCE := distm(c(LOC1_LAT_CORD, LOC1_LONG_CORD),c(LOC2_LAT_CORD, LOC2_LONG_CORD), fun=distHaversine)*0.000621371]

If I pull out each set of coordinates separately, the function works fine.

# ID column names follow the table above; geosphere expects longitude first, then latitude
loc1 <- res[LOC_1_ID == 1, .(LOC1_LONG_CORD, LOC1_LAT_CORD)]
loc2 <- res[LOC_2_ID == 70624, .(LOC2_LONG_CORD, LOC2_LAT_CORD)]
distm(loc1, loc2, fun = distHaversine)

Really, my question is how to apply a function to selected columns within a data.table when that function requires vectors as parameters.

Recommended answer

The distm function generates a distance matrix for a set of points. Are you sure this is the function you want if you're just comparing the points on each row and adding one column?
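As a side note on why the original call fails, here is a minimal base R sketch. Inside a data.table's j expression each column name stands for the whole column, so c(LOC1_LAT_CORD, LOC1_LONG_CORD) concatenates two full columns into one flat vector of length 2n rather than n coordinate pairs. As far as I know, geosphere treats a bare vector as a single point, so a vector of that length is not a valid input. Binding the columns side by side gives the n x 2 (longitude, latitude) matrix that the package works with. The vectors lat and lon are simply the sample values from the question.

```r
# Sample coordinates copied from the question's table
lat <- c(35.68440, 35.49770, 35.66042)
lon <- c(-80.48090, -80.62870, -80.50053)

# What the original call builds: one flat vector, all latitudes then all longitudes
flat <- c(lat, lon)
length(flat)  # 6 values, not 3 points

# One row per point, longitude in the first column
pts <- cbind(lon, lat)
dim(pts)      # 3 rows, 2 columns
```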

It sounds like you actually want either distHaversine or distGeo.

library(data.table)
library(geosphere)

dt <- read.table(text = "LOC_1_ID LOC1_LAT_CORD LOC1_LONG_CORD LOC_2_ID LOC2_LAT_CORD LOC2_LONG_CORD
1       35.68440        -80.48090        70624    34.86752   -82.46632
6       35.49770        -80.62870        70624    34.86752   -82.46632
10       35.66042        -80.50053        70624    34.86752   -82.46632", header = TRUE)

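setDT(dt)  # convert the data.frame to a data.table by reference
# distHaversine works row by row on two-column (longitude, latitude) matrices; result is in metres
dt[, distance_hav := distHaversine(matrix(c(LOC1_LONG_CORD, LOC1_LAT_CORD), ncol = 2),
                                   matrix(c(LOC2_LONG_CORD, LOC2_LAT_CORD), ncol = 2))]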

#     LOC_1_ID LOC1_LAT_CORD LOC1_LONG_CORD LOC_2_ID LOC2_LAT_CORD LOC2_LONG_CORD distance_hav
# 1:        1      35.68440      -80.48090    70624      34.86752      -82.46632     202046.3
# 2:        6      35.49770      -80.62870    70624      34.86752      -82.46632     181310.0
# 3:       10      35.66042      -80.50053    70624      34.86752      -82.46632     199282.1

Update: another answer (linked in the original post) gives a more efficient version of distHaversine for use in data.table.
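The code of that other answer is not reproduced here. As a rough sketch of the general idea only, the haversine formula can be written directly on plain numeric vectors, which avoids building matrices for every call. The helper name haversine_m is hypothetical (it is not part of geosphere), and the default radius of 6378137 m is assumed to match distHaversine's default.

```r
# Hypothetical helper (not from geosphere): vectorised haversine distance in metres.
# r = 6378137 is assumed to match the default radius used by distHaversine.
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6378137) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))  # pmin guards against rounding slightly above 1
}

# Sample rows from the question, held in a plain data.frame
df <- data.frame(
  LOC1_LAT_CORD  = c(35.68440, 35.49770, 35.66042),
  LOC1_LONG_CORD = c(-80.48090, -80.62870, -80.50053),
  LOC2_LAT_CORD  = 34.86752,
  LOC2_LONG_CORD = -82.46632
)

df$distance_m <- with(df, haversine_m(LOC1_LONG_CORD, LOC1_LAT_CORD,
                                      LOC2_LONG_CORD, LOC2_LAT_CORD))
df$distance_mi <- df$distance_m * 0.000621371  # metres to miles, factor from the question
```

In a data.table the same helper would go inside j, for example dt[, distance_m := haversine_m(LOC1_LONG_CORD, LOC1_LAT_CORD, LOC2_LONG_CORD, LOC2_LAT_CORD)]. With the assumed radius, the results should agree closely with the distance_hav column shown earlier.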
