Plotting interpolated data on a map


Problem description

I have survey data of species richness that was taken at various sites in the Chesapeake Bay, USA, and I would like to graphically present the data as a "heat map."

I have a dataframe of lat/long coordinates and richness values, which I converted into a SpatialPointsDataFrame and used the autoKrige() function from the automap package to generate the interpolated values.

First, can anyone comment as to whether I am correctly implementing the autoKrige() function?

Second, I am having trouble plotting the data and overlaying a map of the region. Alternatively, could I specify the interpolation grid to reflect the borders of the Bay (as suggested here)? Any thoughts on how I might do that and where I might get that information? Supplying the grid to autoKrige() appears easy enough.


EDIT: Thanks to Paul for his super helpful post! Here is what I have now. Having trouble getting ggplot to accept both the interpolated data and the map projection:

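require(rgdal)
require(automap)
require(ggplot2) #needed for ggplot(), geom_tile(), geom_path() and coord_map()
require(maps)    #needed for map()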
#Generate lat/long coordinates and richness data
set.seed(6)
df=data.frame(
  lat=sample(seq(36.9,39.3,by=0.01),100,rep=T),
  long=sample(seq(-76.5,-76,by=0.01),100,rep=T),
  fd=runif(10,0,10))
initial.df=df

#Convert dataframe into SpatialPointsDataFrame
coordinates(df)=~long+lat

#Project latlong coordinates onto an ellipse
proj4string(df)="+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs"
#+proj = the type of projection (lat/long)
#+ellps and +datum = the irregularity in the ellipse represented by planet earth

#Transform the projection into Euclidean distances
project_df=spTransform(df, CRS("+proj=merc +zone=18s +ellps=WGS84 +datum=WGS84")) #projInfo(type="proj")

#Perform the interpolation using kriging
kr=autoKrige(fd~1,project_df)
#Extract the output and convert to dataframe for easy plotting with ggplot2
kr.output=as.data.frame(kr$krige_output)
#Plot the output
#Load the map data for the Chesapeake Bay
cb=data.frame(map("state",xlim=range(initial.df$long),ylim=range(initial.df$lat),plot=F)[c("x","y")])

ggplot()+
  geom_tile(data=kr.output,aes(x=x1,y=x2,fill=var1.pred))+  
  geom_path(data=cb,aes(x=x,y=y))+
  coord_map(projection="mercator")

Solution

I have a number of remarks on your post:

Using kriging

I see that you are using geostatistics to construct your heat map. You could also consider other interpolation techniques, such as splines (e.g. thin plate splines in the fields package). These make fewer assumptions about the data (e.g. stationarity) and can visualize your data just as well. Having fewer assumptions might also help if you submit the work to a journal, because you then have less to justify to the reviewers. You can also compare a few interpolation techniques if you want; see a report I wrote for some tips.
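
As a rough illustration of the spline alternative, the sketch below fits a thin plate spline with Tps() from the fields package and predicts it on a regular grid with predictSurface(). The coordinates (in kilometres) and richness values are hypothetical random data, not the survey data from the question, and the 50 x 50 grid size is an arbitrary choice.

```r
library(fields)

# Hypothetical sample: 100 sites in projected coordinates (km) with a richness value
set.seed(6)
xy <- cbind(x = runif(100, 0, 50), y = runif(100, 0, 250))
fd <- runif(100, 0, 10)

# Fit a thin plate spline; the smoothing parameter is chosen by GCV by default
fit <- Tps(xy, fd)

# Predict on a 50 x 50 grid; cells outside the convex hull of the data are NA
surf <- predictSurface(fit, nx = 50, ny = 50)

# Flatten to a data.frame for plotting, e.g. with geom_tile() in ggplot2
surf_df <- data.frame(
  expand.grid(x = surf$x, y = surf$y),
  fd_pred = as.vector(surf$z)
)
head(na.omit(surf_df))
```

The resulting data.frame has one row per grid cell, which is the layout geom_tile() expects.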

Data projection

I see that you are using lat/lon coordinates for kriging. Edzer Pebesma (author of gstat) has remarked that there are no variogram models suitable for lat/lon coordinates. This is because distances in lat/lon are not straight-line (i.e. Euclidean) distances but distances over a sphere (i.e. great-circle distances), and there are no covariance functions (or variogram models) that are valid for spherical coordinates. I recommend projecting the data using spTransform from the rgdal package before using automap.
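
To see why degrees cannot be treated as Euclidean units, the sketch below compares the ground distance covered by one degree of latitude with that covered by one degree of longitude at roughly the latitude of the Chesapeake Bay. The haversine_km() helper is written here for illustration and assumes a spherical Earth with a radius of 6371 km.

```r
# Great-circle distance in km between two lon/lat points (haversine formula),
# assuming a spherical Earth
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(sqrt(pmin(1, a)))
}

# One degree north-south versus one degree east-west at latitude 38
lat_km <- haversine_km(-76, 38, -76, 39)
lon_km <- haversine_km(-76, 38, -75, 38)

c(one_degree_lat_km = lat_km, one_degree_lon_km = lon_km)
```

A degree of longitude is noticeably shorter than a degree of latitude here, so a variogram fitted to raw degrees would mix two different distance scales.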

The rgdal package uses the proj4 projection library to perform the calculations. To project your data you first need to define its projection:

proj4string(df) = "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs"

The proj4 string on the right-hand side of the expression above defines the type of projection (+proj), the ellipsoid that was used (+ellps) and the datum (+datum). To understand what these terms mean, imagine the Earth as a potato. The Earth is not perfectly spherical; its overall shape is approximated by the ellipsoid. Neither is the Earth a perfect ellipsoid, because its surface is more irregular. This irregularity is described by the datum. See also this article on Wikipedia.

Once you have the projection defined, you can use spTransform:

project_df = spTransform(df, CRS("+proj= etcetc"))

where CRS("+proj etc") defines the target projection. Which projection is appropriate depends on your geographical location and the size of your study area.
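
A minimal sketch of the two steps together, assuming UTM zone 18N (which spans 78°W to 72°W and so covers the Chesapeake Bay) is an acceptable target projection. That choice is an assumption made for this example, not something prescribed above, and the points are hypothetical random data.

```r
library(sp)  # with sp versions before 2.0, also load rgdal for spTransform()

# Hypothetical survey points in lon/lat, mirroring the ranges in the question
set.seed(6)
pts <- data.frame(
  long = runif(100, -76.5, -76),
  lat  = runif(100, 36.9, 39.3),
  fd   = runif(100, 0, 10)
)
coordinates(pts) <- ~long + lat

# Step 1: declare the projection the coordinates are currently in
proj4string(pts) <- CRS("+proj=longlat +datum=WGS84 +no_defs")

# Step 2: transform to the assumed target projection (UTM zone 18N, metres)
project_df <- spTransform(
  pts,
  CRS("+proj=utm +zone=18 +datum=WGS84 +units=m +no_defs")
)

is.projected(project_df)
summary(coordinates(project_df))
```

After the transformation the coordinates are eastings and northings in metres, so Euclidean distances between points are meaningful for variogram fitting.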

Plotting with ggplot2

To add polygons or polylines to a ggplot, please take a look at the documentation of coord_map. It includes an example of using the maps package to plot country boundaries. If you need to load, for example, shapefiles for your study area, you can do so using rgdal. Do remember that ggplot2 works with data.frames, not SpatialPolygons. You can transform SpatialPolygons to a data.frame using:

poly_df = fortify(poly_Spatial)

See also this function I created to plot spatial grids. It works directly on SpatialGrids/Pixels. Note that you need to source one or two additional files from that repository (continuousToDiscrete).
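
As a small sketch of the maps-based route, the example below pulls state outlines as a plain data.frame with map_data() from ggplot2 (which reads them from the maps package) and draws them with geom_polygon(). The choice of Maryland and Virginia and the axis limits are assumptions made to frame the Chesapeake Bay.

```r
library(ggplot2)  # map_data() additionally needs the maps package installed

# State outlines as a data.frame with long, lat and group columns
md_va <- map_data("state", region = c("maryland", "virginia"))

# One polygon per group; the limits below are an assumed frame around the bay
p <- ggplot(md_va, aes(x = long, y = lat, group = group)) +
  geom_polygon(fill = NA, colour = "black") +
  coord_map(projection = "mercator",
            xlim = c(-77.5, -75.5), ylim = c(36.9, 39.3))

# print(p)  # draws the map; coord_map() needs the mapproj package installed
```

Because the outlines are in longitude and latitude, any interpolated layer added to this plot has to be in longitude and latitude as well.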

Creating interpolation grid

I wrote automap so that it generates an output grid when none is specified. It does this by creating a convex hull around the data points and sampling 5000 points inside it. Both the boundary of the prediction area and the number of points sampled in it (and thus the resolution) are quite arbitrary. For a specific application, the shape of the prediction area can be derived from a polygon, using spsample to sample points inside the polygon. How many points to sample, and thus the resolution, depends on two things:

  • the kind of data you have. For example, if your data are very smooth, there is little point in raising the resolution far beyond the scale of that smoothness. Conversely, if your data have many small-scale structures, you need a high resolution. This is of course only possible if you have the observations to support such a high resolution.
  • the density of the data. If your data are denser, you can raise the resolution.

If you use your interpolated map for subsequent analyses, getting the resolution right is important. If you use the map purely for visualisation purposes, this is less important. Note, however, that in both cases a resolution that is too high can be misleading about the accuracy of your predictions, while a resolution that is too low does not do justice to the data.
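
A minimal sketch of deriving the prediction locations from a polygon with spsample(). The outline below is a made-up quadrilateral in projected coordinates (metres), standing in for a real bay outline that would normally be read from a shapefile, and n = 2000 is an arbitrary resolution.

```r
library(sp)

# Hypothetical study-area outline (clockwise ring, first point repeated at the end)
outline <- cbind(
  x = c(0, 20000, 50000, 40000, 0),
  y = c(0, 260000, 250000, 0, 0)
)
bay_poly <- SpatialPolygons(
  list(Polygons(list(Polygon(outline, hole = FALSE)), ID = "bay"))
)

# Sample roughly 2000 regularly spaced prediction locations inside the polygon
set.seed(6)
grid_pts <- spsample(bay_poly, n = 2000, type = "regular")

length(grid_pts)  # the actual count is close to, but not exactly, n

# The points could then be supplied to autoKrige() as prediction locations,
# e.g. autoKrige(fd ~ 1, project_df, new_data = grid_pts), provided both
# objects share the same projection.
```

Raising or lowering n changes the grid spacing, which is the resolution trade-off discussed above.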
