如何在 R 中产生网格输出并消除不在陆地上的网格方块? [英] How to produce gridded output in R and eliminate grid squares that are not over land?

查看:54
本文介绍了如何在 R 中产生网格输出并消除不在陆地上的网格方块?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试使用薄板样条算法生成英国的网格降雨数据,并消除 R 中未覆盖的值 - 到目前为止我只能手动实现该过程.这个问题(对我来说)很有挑战性,甚至很难解释——所以我将逐步完成我迄今为止所做的.非常欢迎任何帮助.

I am trying to produce gridded rainfall data over the UK by using a Thin Plate Spline algorithm and eliminate values that are not over land in R - a process I can only achieve so far manually. The problem is challenging (for me) and even challenging to explain - so I will step through what I have done so far. Any help will be greatly welcome.

首先,我将一个数据表加载到 R 中,该表表示来自多个点位置气象站的单日降雨量,数据表的每一行都包含日期、站的 id、东向和北向站、该站点的日降雨量和当年的平均降雨量.我还加载了库字段、maptools 和 gstat.

First, I load a data table into R that represents rainfall on a single day from a number of point location weather stations, and each row of the data table contains the date, id of the station, the easting and northing of the station, the daily rainfall at that site and the average rainfall for the year. I also load the libraries fields,maptools and gstat.

library(fields)
library(maptools)
library(gstat)

dat <- read.table("1961month1day1.csv", header=T, sep=",", quote = "")
names(dat) <- c("easting", "northing", "dailyrainfall","avaerageyearlyrainfall")

以下是数据示例:

dput(head(dat, 20))
structure(list(easting = c(130000L, 145000L, 155000L, 170000L, 
180000L, 180000L, 180000L, 180000L, 185000L, 200000L, 200000L, 
205000L, 210000L, 220000L, 225000L, 230000L, 230000L, 230000L, 
230000L, 235000L), northing = c(660000L, 30000L, 735000L, 40000L, 
30000L, 45000L, 60000L, 750000L, 725000L, 50000L, 845000L, 65000L, 
770000L, 105000L, 670000L, 100000L, 620000L, 680000L, 95000L, 
120000L), dailyrainfall = c(9.4, 4.1, 12.4, 2.8, 1.3, 3.6, 4.8, 26.7, 19.8, 
4.6, 1.7, 4.1, 12.7, 1.8, 3, 5.3, 1, 1.5, 1.5, 4.6), averageyearlyrainfall = c(1334.626923, 
1123.051923, 2072.030769, 1207.584615, 928, 1089.334615, 880.0884615, 
2810.323077, 1933.719231, 1215.642308, 2644.171154, 1235.913462, 
2140.111538, 1010.436538, 1778.432692, 1116.934615, 912.2807692, 
1579.386538, 1085.498077, 1250.601923)), .Names = c("easting", 
"northing", "dailyrainfall", "averageyearlyrainfall"), row.names = c(NA, 20L), class = "data.frame")

然后,我可以将薄板样条拟合到数据上,以便为我提供网格曲面并绘制曲面:

I can then fit a thin plate spline to the data so as to give me a gridded surface and plot the surface:

fit <- Tps(cbind(dat$easting,dat$northing),dat$dailyrainfall)
surface(fit)

然后我可以使用以下方法以 1 公里的步长创建英国的网格:

I can then create a grid of the UK, in 1km steps by using:

xvals <- seq(0, 700000, by=1000)
yvals <- seq(0, 1250000, by=1000)

然后将表面绘制到此网格上并将数据写入表格:

and then plot the surface onto this grid and write the data into a table:

griddf <- expand.grid(xvals, yvals)
griddf$pred <- predict(fit, x=as.matrix(griddf))
write.table(griddf, file="1Jan1961grid.csv", sep=",", qmethod="double")

很棒 - 到目前为止一切都很好.我现在已经在整个 0 到 700000 (E) 和 0 到 1250000 (N) 网格上将我的点数据转换为 1km 网格数据.写入的数据表是一个包含索引、东距、北距和预测降雨值的列表.

Great - so far so good. I now have converted my point data to 1km gridded data over the entire 0 to 700000 (E) and 0 to 1250000 (N) grid. The written data table is a list containing an index, an easting, a northing and the predicted rainfall value.

现在的挑战 - 我想从这个列表中消除任何不超过土地的值.我可以通过将数据加载到 excel(或 Access)并将数据与包含相同网格和年平均降雨量的另一个文件(该文件称为 1kmgridaveragerainfall.csv)进行比较来手动实现这一点.这是此文件的示例:

Now the challenge - I want to eliminate any values from this list that are not over land. I can achieve this manually by loading the data into excel (or Access) and comparing the data to another file that contains the same grid and the average yearly rainfall (the file is called 1kmgridaveragerainfall.csv). Here is a sample of this file:

dput(head(dat1, 20))
structure(list(easting = c(-200000L, -200000L, -200000L, -200000L, 
-200000L, -200000L, -200000L, -200000L, -200000L, -200000L, -200000L, 
-200000L, -200000L, -200000L, -200000L, -200000L, -200000L, -200000L, 
-200000L, -200000L), northing = c(1245000L, 1240000L, 1235000L, 
 1230000L, 1225000L, 1220000L, 1215000L, 1210000L, 1205000L, 1200000L, 
 1195000L, 1190000L, 1185000L, 1180000L, 1175000L, 1170000L, 1165000L, 
 1160000L, 1155000L, 1150000L), averageyearlyrainfall = c(-9999, -9999, -9999, 
 -9999, -9999, -9999, -9999, -9999, -9999, -9999, -9999, -9999, 
 -9999, -9999, -9999, -9999, -9999, -9999, -9999, -9999)), .Names = c("easting", 
 "northing", "averageyearlyrainfall"), row.names = c(NA, 20L), class = "data.frame")

任何不在陆地上的方格的年平均降雨量为 -9999.因此,一旦匹配(即使用 vlookup 或 Access 中的查询),我可以过滤掉具有此 -9999 值的值,这给我留下了一个数据表,其中包含仅土地价值的东向和北向以及日降雨量和平均年降雨量.然后我可以将它加载回 R 并使用以下方法绘制它:

Any grid square that is not over land has an average yearly rainfall of -9999. Hence once matched (i.e. using vlookup or a query in Access) I can filter out values that have this -9999 value and this leaves me with a data table that has the easting and northing and daily rainfall and average yearly rainfall for land values only. I can then load this back into R and plot this using:

quilt.plot(cbind(dat$easting,dat$northing),dat$mm, add.legend=TRUE, nx=654, ny=1209,xlim=c(0,700000),ylim=c(0,1200000))

我在英国土地(而不是海域)上留下了一块降雨.

and I am left with a plot of rainfall over the UK land (and not the sea area).

那么,任何人都可以提出一种方法来实现相同但没有使用excel或访问的所有过滤等,即只能使用R来实现吗?有没有办法在开始时将两个数据表加载到 R 中,并以某种方式将点数据的 TPS 拟合到平均数据上,以便不绘制等于 -9999 的网格方块.

So, can anybody suggest a way of achieving the same but without all the filtering etc using excel or access i.e. can the same be achieved using R only? Is there a way of loading both data tables into R at the beginning and somehow fitting the TPS of the point data over the average data so that grid squares that are equal to -9999 are not plotted.

我知道可以使用协变量 (Z) 对 TPS 进行加权 - 这有帮助吗?即

I know that the TPS can be weighted using a covariate (Z) - does this help at all? i.e.

fit <- Tps(cbind(dat$easting,dat$northing),dat$dailyrainfall, Z=dat$averageyearlyrainfall)

此外,当我执行原始 TPS 的表面(拟合)时,我如何将绘图扩展到绘图的边缘 - 我确定我已经在某处读过这篇文章,你在其中放置了诸如 interp=TRUE 之类的东西,但是这个不起作用.

Also, when I perform surface(fit) of the original TPS, how do I extend the plot to the edges of the plot - I'm sure I've read this somewhere where you put something like interp=TRUE but this doesn't work.

任何帮助将不胜感激

谢谢,托尼

推荐答案

如果您已经拥有两个数据帧,您应该能够将它们合并到一个新的数据帧中并过滤/子集结果.

If you have already got to the point where you have the two dataframes you should be able to merge them into a new dataframe and filter/subset the result.

set.seed(1234) # for reproducibility

# "The written data table is a list containing an index, an easting,
# a northing and the predicted rainfall value"
# Create a simple data frame containing made-up data
mydf1 <- data.frame(index = 1:10,
                    easting = c(1, 1, 3, 4, 5, 5, 5, 5, 6, 6),
                    northing = c(12, 13, 13, 13, 14, 14, 15, 17, 18, 20),
                    predicted = runif(10, 500, 1000))

# "...comparing the data to another file that contains the same grid
# and the average yearly rainfall"
# Second data frame is similar, but has rainfall instead of predicted
mydf2 <- data.frame(index = 1:10,
                    easting = c(1, 1, 3, 4, 5, 5, 5, 5, 6, 6),
                    northing = c(12, 13, 13, 13, 14, 14, 15, 17, 18, 20),
                    rainfall = c(runif(9, 500, 1000), -9999))

# If data frames are of same size and have mostly common columns,
# merging them probably makes it easy to manipulate the data
mydf.merged <- merge(mydf1, mydf2)

# Finally, filter the merged data frame so that it only contains
# rainfall values that are not the -9999 value that denotes sea
mydf.final <- mydf.merged[mydf.merged$rainfall > -9999, ]

这是第一个数据框:

> mydf1
   index easting northing predicted
1      1       1       12  556.8517
2      2       1       13  811.1497
3      3       3       13  804.6374
4      4       4       13  811.6897
5      5       5       14  930.4577
6      6       5       14  820.1553
7      7       5       15  504.7479
8      8       5       17  616.2753
9      9       6       18  833.0419
10    10       6       20  757.1256
> 

这是第二个数据框:

> mydf2
   index easting northing   rainfall
1      1       1       12   846.7956
2      2       1       13   772.4874
3      3       3       13   641.3668
4      4       4       13   961.7167
5      5       5       14   646.1579
6      6       5       14   918.6478
7      7       5       15   643.1116
8      8       5       17   633.4104
9      9       6       18   593.3614
10    10       6       20 -9999.0000
> 

合并数据框:

> mydf.merged
   index easting northing predicted   rainfall
1      1       1       12  556.8517   846.7956
2     10       6       20  757.1256 -9999.0000
3      2       1       13  811.1497   772.4874
4      3       3       13  804.6374   641.3668
5      4       4       13  811.6897   961.7167
6      5       5       14  930.4577   646.1579
7      6       5       14  820.1553   918.6478
8      7       5       15  504.7479   643.1116
9      8       5       17  616.2753   633.4104
10     9       6       18  833.0419   593.3614
> 

删除了 -9999 行的最终数据框:

Final dataframe with -9999 row removed:

> mydf.final
   index easting northing predicted rainfall
1      1       1       12  556.8517 846.7956
3      2       1       13  811.1497 772.4874
4      3       3       13  804.6374 641.3668
5      4       4       13  811.6897 961.7167
6      5       5       14  930.4577 646.1579
7      6       5       14  820.1553 918.6478
8      7       5       15  504.7479 643.1116
9      8       5       17  616.2753 633.4104
10     9       6       18  833.0419 593.3614
> 

这篇关于如何在 R 中产生网格输出并消除不在陆地上的网格方块?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆