Calculate Distance using Latitude and Longitude data in Different Data frames of different lengths with loop
Problem description
I have 2 data frames of different lengths, each with a longitude and latitude coordinate. I would like to connect the two data frames by calculating the distance between the lat/long points.
For simplicity, Data frame A (starting point) has the following structure
ID long lat
1 -89.92702 44.19367
2 -89.92525 44.19654
3 -89.92365 44.19756
4 -89.91949 44.19848
5 -89.91359 44.19818
And Data frame B (end point) has a similar structure but shorter
ID LAT LON
1 43.06519 -87.91446
2 43.14490 -88.07172
3 43.08969 -87.91202
I would like to calculate the distance between every pair of points, so that I end up with a data frame, merged to A, that has the distances between A1 and B1, A1 and B2, and A1 and B3. This should repeat for every value of A$ID against every value of B$ID.
A$ID B$ID
1 1
2 2
3 3
4
5
Prior to posting this, I consulted several Stack Overflow threads (including this one) and this Medium post, but I am not sure how to approach the looping, especially since the data frames are of different lengths.
Thanks!
Recommended answer
Here's a solution using two packages: sf and tidyverse. The first is used to convert the data into simple features and calculate the distances, while the second is used to put the data into the desired format.
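The solution assumes that the data frames A and B already exist in the session. A minimal way to recreate them from the sample rows in the question (the values are copied from the tables above; the names A and B follow the question) might look like this:

```r
# Data frame A (starting points), copied from the question
A <- data.frame(
  ID   = 1:5,
  long = c(-89.92702, -89.92525, -89.92365, -89.91949, -89.91359),
  lat  = c(44.19367, 44.19654, 44.19756, 44.19848, 44.19818)
)

# Data frame B (end points), copied from the question.
# Note that its coordinate columns are named and ordered differently from A's.
B <- data.frame(
  ID  = 1:3,
  LAT = c(43.06519, 43.14490, 43.08969),
  LON = c(-87.91446, -88.07172, -87.91202)
)
```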
library(tidyverse)
library(sf)

# Transform data into simple features.
# crs = 4326 assumes the coordinates are WGS84 longitude/latitude; with a CRS set,
# st_distance() returns distances in meters rather than Euclidean values in degrees
sfA <- st_as_sf(A, coords = c("long", "lat"), crs = 4326)
sfB <- st_as_sf(B, coords = c("LON", "LAT"), crs = 4326)

# Calculate distance between all entries of sfA and sfB
# (one row per point in A, one column per point in B)
distances <- st_distance(sfA, sfB, by_element = FALSE)

# Drop the units class so the matrix converts cleanly to a tibble
distances <- units::drop_units(distances)

# Set colnames for distances matrix from the IDs in B
colnames(distances) <- paste0("B", B$ID)

# Put the results in the desired format
# Transform distances matrix into a tibble
as_tibble(distances) %>%
  # Add the IDs of A as a column (matrix rows follow the row order of A)
  mutate(ID = A$ID) %>%
  # Join with the original A data frame
  right_join(A, by = "ID") %>%
  # Change the order of columns
  select(ID, long, lat, everything()) %>%
  # Put data into long format
  pivot_longer(cols = starts_with("B"),
               names_to = "B_ID",
               names_pattern = "B(\\d+)",
               values_to = "distance")
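As a cross-check that does not depend on sf, the same all-pairs table can be sketched in base R with a cross join and the haversine formula. This is only an illustration under stated assumptions: `haversine_m` is a hypothetical helper written for this example (not part of sf or the tidyverse), it treats the Earth as a sphere with a mean radius of 6371008.8 m, and the renamed columns `B_ID`, `B_lat`, and `B_lon` are chosen here for clarity. Its distances should be close to, but not necessarily identical to, what `st_distance()` reports.

```r
# Sample data copied from the question
A <- data.frame(
  ID   = 1:5,
  long = c(-89.92702, -89.92525, -89.92365, -89.91949, -89.91359),
  lat  = c(44.19367, 44.19654, 44.19756, 44.19848, 44.19818)
)
B <- data.frame(
  ID  = 1:3,
  LAT = c(43.06519, 43.14490, 43.08969),
  LON = c(-87.91446, -88.07172, -87.91202)
)

# Hypothetical helper: haversine great-circle distance in meters.
# Assumes a spherical Earth; inputs are in decimal degrees and are vectorized.
haversine_m <- function(lon1, lat1, lon2, lat2, radius = 6371008.8) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin() guards against rounding pushing sqrt(a) slightly above 1
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Rename B's columns so they do not clash with A's, then cross join:
# merge() with by = NULL returns every combination of rows from A and B
B_renamed <- setNames(B, c("B_ID", "B_lat", "B_lon"))
pairs <- merge(A, B_renamed, by = NULL)

# One distance per (A, B) pair, no explicit loop needed
pairs$distance <- haversine_m(pairs$long, pairs$lat, pairs$B_lon, pairs$B_lat)

# Order so that all B rows for A1 come first, then A2, and so on
pairs <- pairs[order(pairs$ID, pairs$B_ID), ]
```

Because the cross join produces `nrow(A) * nrow(B)` rows, the differing lengths of the two data frames do not matter, and the result is already in long format with one row per pair.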