Average values of a point dataset to a grid dataset


Problem description

I am relatively new to ggplot, so please forgive me if some of my problems are really simple or not solvable at all.

What I am trying to do is generate a "heat map" of a country where the fill of the shape is continuous. I have the shape of the country as an .RData file, and I used Hadley Wickham's script to transform my SpatialPolygon data into a data frame. The long and lat data of my data frame now look like this:

head(my_df)
long        lat         group
6.527187    51.87055    0.1 
6.531768    51.87206    0.1
6.541202    51.87656    0.1
6.553331    51.88271    0.1

This long/lat data draws the outline of Germany. The rest of the data frame is omitted here since I think it is not needed. I also have a second data frame of values for certain long/lat points. This looks like this

my_fixed_points
long        lat         value
12.817      48.917      0.04 
8.533       52.017      0.034
8.683       50.117      0.02
7.217       49.483      0.0542

What I would like to do now is colour each point of the map according to the average value of all the fixed points that lie within a certain distance of that point. That way I would get an (almost) continuous colouring of the whole map of the country. What I have so far is the map of the country plotted with ggplot2:

ggplot(my_df,aes(long,lat)) + geom_polygon(aes(group=group), fill="white") + 
geom_path(color="white",aes(group=group)) + coord_equal()

My first idea was to generate points that lie within the map that has been drawn, and then calculate the value for every generated point my_generated_point like so:

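# Values of all fixed points within 50 km of the generated point
# (with longlat=TRUE, spDistsN1 returns distances in kilometres)
value_vector <- subset(my_fixed_points, 
  spDistsN1(cbind(my_fixed_points$long, my_fixed_points$lat),  
  c(my_generated_point$long, my_generated_point$lat), longlat=TRUE) < 50, 
  select = value)
# subset() returns a data frame, so average its value column
point_value <- mean(value_vector$value)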

I haven't found a way to generate these points, though. And as with the whole problem, I don't even know whether it can be solved this way. My question is whether there is a way to generate these points, and/or whether there is another way to reach a solution.
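For illustration, here is a minimal base-R sketch of the idea described in the question: lay a regular grid over the bounding box of the outline, keep the grid points that fall inside it, and average the fixed points within a radius of each one. The helper names (inside_polygon, haversine_km, grid_in_outline, mean_within) and the rectangular toy_outline are made up for this example and are not part of any package. The sketch assumes the outline is a single ring without holes, and it uses the spherical haversine formula, so distances will differ slightly from what spDistsN1 reports.

```r
# Ray-casting test: is the point (px, py) inside the ring with vertices (vx, vy)?
inside_polygon <- function(px, py, vx, vy) {
  n <- length(vx)
  j <- n
  inside <- FALSE
  for (i in seq_len(n)) {
    # Does the edge j -> i straddle the horizontal line through py?
    if ((vy[i] > py) != (vy[j] > py)) {
      x_at_py <- vx[i] + (py - vy[i]) * (vx[j] - vx[i]) / (vy[j] - vy[i])
      if (px < x_at_py) inside <- !inside
    }
    j <- i
  }
  inside
}

# Great-circle distance in km on a sphere (an approximation of the ellipsoid)
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Regular grid over the bounding box, restricted to points inside the outline
grid_in_outline <- function(outline, step) {
  candidates <- expand.grid(
    long = seq(min(outline$long), max(outline$long), by = step),
    lat  = seq(min(outline$lat),  max(outline$lat),  by = step)
  )
  keep <- mapply(inside_polygon, candidates$long, candidates$lat,
                 MoreArgs = list(vx = outline$long, vy = outline$lat))
  candidates[keep, ]
}

# Mean value of the fixed points within radius_km of each generated point;
# NA where no fixed point is close enough
mean_within <- function(pts, fixed, radius_km) {
  vapply(seq_len(nrow(pts)), function(i) {
    d <- haversine_km(fixed$long, fixed$lat, pts$long[i], pts$lat[i])
    near <- d < radius_km
    if (any(near)) mean(fixed$value[near]) else NA_real_
  }, numeric(1))
}

# Toy rectangle standing in for the real outline in my_df (an assumption)
toy_outline <- data.frame(long = c(6, 14, 14, 6), lat = c(48, 48, 53, 53))
my_fixed_points <- data.frame(
  long  = c(12.817, 8.533, 8.683, 7.217),
  lat   = c(48.917, 52.017, 50.117, 49.483),
  value = c(0.04, 0.034, 0.02, 0.0542)
)

generated_points <- grid_in_outline(toy_outline, step = 0.5)
generated_points$point_value <- mean_within(generated_points,
                                            my_fixed_points, radius_km = 50)
```

With only four fixed points and a 50 km radius, most generated points get NA, which is one reason an interpolation method may be a better fit than a plain neighbourhood mean.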

Solution

Thanks to Paul I almost got what I wanted. Here is an example with sample data for the Netherlands.

library(ggplot2)
library(sp)
library(automap)
library(rgdal)
library(scales)

#get the spatial data for the Netherlands
con <- url("http://gadm.org/data/rda/NLD_adm0.RData")
print(load(con))
close(con)

#transform them into the right format for autoKrige
gadm_t <- spTransform(gadm, CRS=CRS("+proj=merc +ellps=WGS84"))

#generate some random values that serve as fixed points
value_points <- spsample(gadm_t, type="stratified", n = 200)
values <- data.frame(value = rnorm(dim(coordinates(value_points))[1], 0 ,1))
value_df <- SpatialPointsDataFrame(value_points, values)

#generate a grid that can be estimated from the fixed points
grd = spsample(gadm_t, type = "regular", n = 4000)
kr <- autoKrige(value~1, value_df, grd)
dat = as.data.frame(kr$krige_output)

#draw the generated grid with the underlying map
ggplot(gadm_t,aes(long,lat)) + geom_polygon(aes(group=group), fill="white") + geom_path(color="white",aes(group=group)) + coord_equal() + 
geom_tile(aes(x = x1, y = x2, fill = var1.pred), data = dat) + scale_fill_continuous(low = "white", high = muted("orange"), name = "value")
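The kriging step above depends on automap fitting a variogram. As a rough cross-check of the interpolated surface, a simpler and different technique, inverse distance weighting (IDW), can be sketched in base R. This is not what autoKrige does. The function name idw_predict and the toy coordinates are assumptions made for this example, and the coordinates are taken to be projected so that Euclidean distance is meaningful.

```r
# Inverse distance weighting: each prediction is a weighted mean of all
# observations, with weights 1 / distance^power.
idw_predict <- function(grid_xy, obs_xy, values, power = 2) {
  # Distance matrix: rows = grid points, columns = observations
  dx <- outer(grid_xy[, 1], obs_xy[, 1], "-")
  dy <- outer(grid_xy[, 2], obs_xy[, 2], "-")
  d <- sqrt(dx^2 + dy^2)
  w <- 1 / d^power  # Inf where a grid point coincides with an observation
  pred <- as.vector(w %*% values) / rowSums(w)
  # A grid point lying exactly on an observation takes that observation's value
  hit <- which(d == 0, arr.ind = TRUE)
  pred[hit[, "row"]] <- values[hit[, "col"]]
  pred
}

# Toy data: two observations on a line, three grid points between them
obs_xy  <- cbind(x = c(0, 10), y = c(0, 0))
values  <- c(0, 10)
grid_xy <- cbind(x = c(0, 2.5, 5), y = c(0, 0, 0))

idw_predict(grid_xy, obs_xy, values)
# 0 1 5
```

In the example above, one would pass coordinates(grd), coordinates(value_df) and value_df$value, then compare the result with var1.pred. Large disagreements would suggest checking the fitted variogram.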

Solution

I think what you want is something along these lines. I predict that this homebrew is going to be terribly inefficient for large datasets, but it works on a small example dataset. I would look into kernel densities and maybe the raster package. But maybe this suits you well...

The following snippet of code calculates the mean value of cadmium concentration of a grid of points overlaying the original point dataset. Only points closer than 1000 m are considered.

library(sp)
library(ggplot2)
library(automap)  # provides loadMeuse()
library(scales)   # provides muted()
loadMeuse()

# Generate a grid to sample on
bb = bbox(meuse)
grd = spsample(meuse, type = "regular", n = 4000)
# Come up with mean cadmium value
# of all points < 1000m.
mn_value = sapply(1:length(grd), function(pt) {
  d = spDistsN1(meuse, grd[pt,])
  return(mean(meuse[d < 1000,]$cadmium))
})

# Make a new object
dat = data.frame(coordinates(grd), mn_value)
ggplot(aes(x = x1, y = x2, fill = mn_value), data = dat) + 
   geom_tile() + 
   scale_fill_continuous(low = "white", high = muted("blue")) + 
   coord_equal()

which leads to the following image:

An alternative approach is to use an interpolation algorithm. One example is kriging. This is quite easy using the automap package (spot the self promotion :), I wrote the package):

library(automap)
kr = autoKrige(cadmium~1, meuse, meuse.grid)
dat = as.data.frame(kr$krige_output)

ggplot(aes(x = x, y = y, fill = var1.pred), data = dat) + 
   geom_tile() + 
   scale_fill_continuous(low = "white", high = muted("blue")) + 
   coord_equal()

which leads to the following image:

However, without knowledge as to what your goal is with this map, it is hard for me to see what you want exactly.
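The neighbourhood mean in the first snippet loops over the grid points with sapply. Below is a vectorised sketch of the same calculation in base R. The function name neighbourhood_mean and the toy numbers are made up for this example. It assumes projected coordinates in metres and builds a full grid-by-observation distance matrix, so it trades memory for speed and is still unsuitable for very large datasets.

```r
# Mean of `values` over all observations closer than `radius` to each grid
# point. Returns NA for grid points with no observation in range.
neighbourhood_mean <- function(grid_xy, obs_xy, values, radius) {
  # Coordinate differences: rows = grid points, columns = observations
  dx <- outer(grid_xy[, 1], obs_xy[, 1], "-")
  dy <- outer(grid_xy[, 2], obs_xy[, 2], "-")
  # 1 where the observation is within the radius, 0 otherwise
  weights <- ((dx^2 + dy^2) < radius^2) * 1
  counts <- rowSums(weights)
  sums <- as.vector(weights %*% values)
  ifelse(counts > 0, sums / counts, NA_real_)
}

# Toy data: three observations on a line, three grid points
obs_xy  <- cbind(x = c(0, 100, 2000), y = c(0, 0, 0))
values  <- c(1, 3, 10)
grid_xy <- cbind(x = c(50, 1900, 10000), y = c(0, 0, 0))

neighbourhood_mean(grid_xy, obs_xy, values, radius = 1000)
# 2 10 NA
```

With the meuse data this would be called as neighbourhood_mean(coordinates(grd), coordinates(meuse), meuse$cadmium, 1000). It should give the same values as the sapply version, except that empty neighbourhoods yield NA rather than NaN.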
