R - Spatial Join Between SpatialPoints (GPS coordinates) and SpatialLinesDataFrame


Problem description

I am working on a university project that combines data science and GIS. We need an open-source solution capable of extracting additional information from a massive dataset of GPS coordinates. This rules out any API with a daily request limit.

Here is a sample of the dataset our professor provided:

longitude <- c(10.86361, 10.96062, 10.93032, 10.93103, 10.93212)
latitude <- c(44.53355, 44.63234, 44.63470, 44.63634, 44.64559)
longlat <- data.frame(longitude, latitude)
ID <- seq_along(longitude)  # one ID per point

FIRST TASK: Already Accomplished!

The first step was joining my SpatialPoints with a SpatialPolygonsDataFrame using over() from sp. The SpatialPolygonsDataFrame was obtained through getData('GADM', country='ITA', level=3) from raster.
For this first task, the objective was to associate each GPS coordinate with the City and Region it belongs to.
An example of the result I was able to obtain is:

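require(sp)
require(rgeos)
require(raster)  # provides getData()
# The data argument must be a data.frame with one row per point
my_spdf <- SpatialPointsDataFrame(coords = longlat, data = data.frame(ID = seq_len(nrow(longlat))), proj4string = CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"))
italy_administrative_boundaries_level3 <- getData('GADM', country='ITA', level=3)
# For each point, take the attributes of the polygon that contains it
result <- over(my_spdf, italy_administrative_boundaries_level3)[, c("NAME_0", "NAME_1", "NAME_2", "NAME_3")]
result$ID <- my_spdf$ID
print(result)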

SECOND TASK: MY QUESTION

Now things become tricky, because I need to associate additional, more detailed information such as road_name and road_type.
This information is contained in the shapefiles built from OpenStreetMap and available at: download.geofabrik.de/europe/italy.html. I loaded the shapefile into R, obtaining a SpatialLinesDataFrame:

require(rgdal)
shapefile_roads <- readOGR(dsn = "./road", layer = "roads")

Then I naively tried to apply the same technique I had used to join SpatialPoints and SpatialPolygonsDataFrame:

result <- over(my_spdf, shapefile_roads)

The result is just NA. One possible reason that came to mind is that the coordinates in my_spdf do not lie exactly on the Lines in shapefile_roads, so I would need some kind of radius parameter. However, I am not really sure.

Can you suggest the correct approach to perform this spatial join between my SpatialPoints and the attributes of the SpatialLinesDataFrame obtained from the OpenStreetMap road shapefile?

If anything is unclear, please do not hesitate to ask.

Recommended answer

Your example data:

library(raster)
longitude <- c(10.86361, 10.96062, 10.93032, 10.93103, 10.93212)
latitude <- c(44.53355, 44.63234, 44.63470, 44.63634, 44.64559)
longlat <- data.frame(longitude, latitude)
ID <- data.frame(ID=1:5)
ita_gadm3 <- getData('GADM', country='ITA', level=3)[, c("NAME_0", "NAME_1", "NAME_2", "NAME_3")]
# use `sp::over` or `raster::extract`
result <- extract(ita_gadm3, longlat)

Some roads:

road <- spLines(cbind(longitude+.1, latitude), cbind(longitude-.1, rev(latitude)), cbind(longitude-.1, latitude+1), crs=crs(ita_gadm3))

Now find the nearest road segment. You can use geosphere::dist2Line because you are using angular (lon/lat) coordinates.

library(geosphere)
geosphere::dist2Line(longlat, road)
#     distance      lon      lat ID
#[1,] 2498.825 10.83212 44.53355  2
#[2,] 5527.646 11.03032 44.63470  1
#[3,] 5524.227 10.86062 44.63634  2
#[4,] 5577.372 10.86062 44.63634  2
#[5,] 5756.113 10.86062 44.63634  2

Note the ID variable, which refers back to the roads. The problem is that dist2Line is currently slow and you have a large data set.
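Since the ID column is a row index into the lines object, it can be used to pull the road attributes for each point. The following is only a minimal sketch on toy data: the two roads, the attribute values, and the names roads, pts and joined are made up for illustration, and it assumes the lines are held in a SpatialLinesDataFrame whose rows are in the same order as its features.

```r
library(sp)
library(geosphere)

# Two toy roads in lon/lat with made-up attributes (hypothetical values)
l1 <- Lines(list(Line(cbind(c(10.8, 11.0), c(44.5, 44.5)))), ID = "a")
l2 <- Lines(list(Line(cbind(c(10.8, 11.0), c(44.7, 44.7)))), ID = "b")
roads <- SpatialLinesDataFrame(
  SpatialLines(list(l1, l2)),
  data = data.frame(road_name = c("Via A", "Via B"),
                    road_type = c("residential", "primary"),
                    row.names = c("a", "b"))
)

# Two toy points, one near each road
pts <- cbind(lon = c(10.9, 10.9), lat = c(44.52, 44.69))

# One row per point: distance (m), nearest location on the line, line index
d <- dist2Line(pts, roads)

# Use the ID column as a row index into the attribute table
joined <- cbind(as.data.frame(pts),
                roads@data[d[, "ID"], ],
                distance_m = d[, "distance"])
print(joined)
```

With your data, roads would be the object returned by readOGR and pts your longlat table.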

The alternative is to transform your spatial data to a planar coordinate system appropriate for Italy and use gDistance.

library(rgeos)
sp <- SpatialPoints(longlat, proj4string=crs(ita_gadm3))
spita <- spTransform(sp, "+proj=tmerc +lat_0=0 +lon_0=15 +k=0.9996 +x_0=2520000 +y_0=0 +ellps=intl +units=m")
rdita <- spTransform(road, "+proj=tmerc +lat_0=0 +lon_0=15 +k=0.9996 +x_0=2520000 +y_0=0 +ellps=intl +units=m")

gd <- gDistance(rdita, spita, byid=TRUE)
a <- apply(gd, 1, which.min)
a
#1 2 3 4 5 
#2 1 2 2 2 

That is, point 2 is closest to road 1, and the other points are closest to road 2. You probably need to do this in batches of points, or in tiles, to avoid building a distance matrix that is too large.
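Processing the points in batches could look roughly like the sketch below. It uses toy planar data in place of the projected points and roads, and nearest_line_batched and batch_size are hypothetical names introduced here, not part of rgeos. Only one batch of points is held in the distance matrix at a time.

```r
library(sp)
library(rgeos)

# Toy planar data (metres), standing in for the projected points and roads
pts <- SpatialPoints(cbind(x = c(0, 10, 20, 30, 40), y = c(1, 9, 2, 8, 1)))
roads <- SpatialLines(list(
  Lines(list(Line(cbind(c(0, 50), c(0, 0)))), ID = "r1"),
  Lines(list(Line(cbind(c(0, 50), c(10, 10)))), ID = "r2")
))

# Hypothetical helper: index of the nearest line for each point,
# computed one batch of points at a time
nearest_line_batched <- function(points, lines, batch_size = 2L) {
  n <- nrow(coordinates(points))
  starts <- seq(1L, n, by = batch_size)
  idx <- lapply(starts, function(s) {
    i <- s:min(s + batch_size - 1L, n)
    # Rows are the points in this batch, columns are the lines
    gd <- gDistance(lines, points[i, ], byid = TRUE)
    apply(gd, 1, which.min)
  })
  unname(unlist(idx))
}

nearest <- nearest_line_batched(pts, roads)
print(nearest)
```

The batch size is a trade-off: each batch still compares its points against every road, so tiling the roads as well may be needed for a country-wide shapefile.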

The buffer solution suggested by Sébastien could work in principle, but it gets really complicated because there is no good buffer size. On the one hand, points may fall outside every buffer; on the other hand, they may overlap with several buffers. If you use buffers, sp::over returns an arbitrary match when there are multiple matches, whereas raster::extract returns them all. Neither is pretty, and I would avoid this approach. Illustrated here:

b <- buffer(road, width=.15, dissolve=FALSE)
plot(b)
lines(road, col='red', lwd=2)
points(longlat, pch=20, col='blue')

extract(b, longlat)
#   point.ID poly.ID
#1         1       1
#2         1       2
#3         2       2
#4         2       1
#5         3       2
#6         3       1
#7         4       2
#8         4       1
#9         5       1
#10        5       2

over(sp, b)
#1 2 3 4 5 
#2 2 2 2 2 
