Determine if a given lat-lon belongs to a polygon

Question

Assume I have a data file called zone with 1994 strings of 2D coordinates denoting the vertices of polygons, like the following (the first number on the right-hand side of each line denotes the zone):

c1 <- "1", "1 21, 31 50, 45 65, 75 80"

c2 <- "2", "3 20, 5 15, 2 26, 70 -85, 40 50, 60 80"

.....

c1993 <- "1993", "3 2, 2 -5, 0 60, 7 -58, -12 23, 56 611, 85 152"

c1994 <- "1994", "30 200, 50 -15, 20 260, 700 -850, -1 2, 5 6, 8 15"

Now I want to manipulate these strings so that, given a random lat-lon pair (say 12 and 20), I can check whether it falls into the first polygon, the second polygon, the 3rd polygon, ... or the 1994th polygon. The brute-force solution is: compare the x-coordinate (= 12) to all 4 x-coordinates, and the y-coordinate (= 20) to all 4 y-coordinates, in c1 and c2 respectively. The conclusion would be whether there is a valid "sandwich" inequality for each given coordinate x and y.

For example, using the solution process above, the point (12,20) will be in c1 but not c2.

My question: How could I achieve this goal in R?

My attempt: Thanks to Stéphane Laurent's help, I was able to generate all the matrices, each of a certain size, that store the lat-lon pairs of all the vertices of each polygon, with the following code:

library(readr)

zone <- read_delim("[directory path to zone.csv file]", delim = ",", col_names = TRUE)
for(i in 1:nrow(zone)){
  zone$geo[i] = substr(zone$geo[i], 10, 135)
}
zone <- zone[complete.cases(zone),]

Numextract <- function(string){
  unlist(regmatches(string, gregexpr("[[:digit:]]+\\.*[[:digit:]]*", string)))
}

# Note: poly1 and poly2 are overwritten on every iteration
for(i in 1:nrow(zone)){
  poly1 <- matrix(as.numeric(Numextract(zone$geo[i])), ncol=2, byrow=TRUE)
  poly2 <- cbind(poly1, c(i))
}

However, as you can see, I need a way to index the matrices generated during the for() loop, one per zone. The reason is that afterwards I could use another for() loop to determine which zone a point belongs to. But I have not been able to figure this out, so can anyone please help me with detailed code?

Actual datasets
Zone and polygon dataset

Lat-Lon pair dataset

Recommended answer

First, define your polygons as matrices, each row representing a vertex:

poly1 <- rbind(c(1,21), c(31,50), c(45,65), c(75,80))
poly2 <- rbind(c(3,20), c(5,15), c(2,26), c(70,-85))

Define the point to test:

point <- c(12,20)

Now, use the pip2d function of the ptinpoly package:

> library(ptinpoly)
> pip2d(poly1, rbind(point))
[1] -1
> pip2d(poly2, rbind(point))
[1] 1

That means (see ?pip2d) that the point is outside poly1 and inside poly2.

Note the rbind(point) in pip2d. We use rbind because, more generally, the test can be run for several points against the same polygon at once.
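To illustrate what a point-in-polygon test does, here is a minimal base-R sketch of the classic ray-casting (even-odd) rule. It is not the implementation used by pip2d; the function name point_in_polygon and the square polygon are hypothetical, and points lying exactly on an edge are not handled specially.

```r
# Ray-casting test: count how many polygon edges are crossed by a
# horizontal ray that starts at the point and goes to the right.
# poly: matrix with one vertex per row (x, y); pt: c(x, y).
point_in_polygon <- function(pt, poly) {
  n <- nrow(poly)
  inside <- FALSE
  j <- n
  for (i in seq_len(n)) {
    xi <- poly[i, 1]; yi <- poly[i, 2]
    xj <- poly[j, 1]; yj <- poly[j, 2]
    # Does the edge between vertices j and i straddle the horizontal
    # line through the point?
    if ((yi > pt[2]) != (yj > pt[2])) {
      # x-coordinate where the edge meets that line
      x_cross <- xi + (pt[2] - yi) * (xj - xi) / (yj - yi)
      # Each crossing to the right of the point flips the state
      if (pt[1] < x_cross) inside <- !inside
    }
    j <- i
  }
  inside
}

square <- rbind(c(0, 0), c(10, 0), c(10, 10), c(0, 10))  # hypothetical polygon
point_in_polygon(c(5, 5), square)   # TRUE
point_in_polygon(c(15, 5), square)  # FALSE
```

An odd number of crossings means the point is inside, an even number means it is outside.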

If you need help converting

c1 <- "1 21, 31 50, 45 65, 75 80"

to

poly1 <- rbind(c(1,21), c(31,50), c(45,65), c(75,80))

then maybe you should open another question.

OK, do not open another question. You can proceed as follows.

c1 <- "1 21, 31 50, 45 65, 75 80"

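# Extract all numbers from a string; the optional "-" keeps negative
# coordinates such as -85 from losing their sign
Numextract <- function(string){
  unlist(regmatches(string, gregexpr("-?[[:digit:]]+\\.?[[:digit:]]*", string)))
}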

poly1 <- matrix(as.numeric(Numextract(c1)), ncol=2, byrow=TRUE)

which gives:

> poly1
     [,1] [,2]
[1,]    1   21
[2,]   31   50
[3,]   45   65
[4,]   75   80

2nd Edit

For your second problem, your data are too big. The only solution I can see is to split the data into smaller pieces.

But first of all, it seems that the pip2d function also causes the R session to crash. So use another function: pnt.in.poly from the package SDMTools.

Here is a small modification of this function, which makes it faster by dropping the outputs that are not needed:

library(SDMTools)
pnt.in.poly2 <- function(pnts, poly.pnts){
  if (poly.pnts[1, 1] == poly.pnts[nrow(poly.pnts), 1] && 
      poly.pnts[1, 2] == poly.pnts[nrow(poly.pnts), 2]){ 
    poly.pnts = poly.pnts[-1, ]
  }
  out = .Call("pip", pnts[, 1], pnts[, 2], nrow(pnts), poly.pnts[,1], poly.pnts[, 2], nrow(poly.pnts), PACKAGE = "SDMTools")
  return(out)
}

Now, as said before, split lat_lon into smaller pieces of 1 million rows each (except the last one, which is smaller):

lat_lon_list <- vector("list", 70)
for(i in 1:69){
  lat_lon_list[[i]] = lat_lon[(1+(i-1)*1e6):(i*1e6),]
}
lat_lon_list[[70]] <- lat_lon[69000001:nrow(lat_lon),]
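The chunk count above is hardcoded for this particular dataset. As a minimal sketch, the same split can be derived from the number of rows; the small random lat_lon matrix and the chunk size of 10 are hypothetical stand-ins for the real data and for 1e6.

```r
# Hypothetical stand-in for the real lat_lon data (two columns).
lat_lon <- cbind(lat = runif(25), lon = runif(25))

chunk_size <- 10  # 1e6 in the answer

# Group the row indices into consecutive blocks of at most chunk_size rows.
chunk_id <- ceiling(seq_len(nrow(lat_lon)) / chunk_size)
row_groups <- split(seq_len(nrow(lat_lon)), chunk_id)

# drop = FALSE keeps each piece a matrix even if it has a single row.
lat_lon_list <- lapply(row_groups, function(rows) lat_lon[rows, , drop = FALSE])

sapply(lat_lon_list, nrow)
```

Here the 25 rows end up in three pieces of 10, 10 and 5 rows, so the last, smaller piece needs no special case.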

Now, run the following code:

library(data.table)
for(i in 1:70){
  DT <- data.table(V1 = pnt.in.poly2(lat_lon_list[[i]], polys[[1]]))
  for(j in 2:length(polys)){
    DT[, (sprintf("V%d",j)):=pnt.in.poly2(lat_lon_list[[i]], polys[[j]])]
  }
  fwrite(DT, sprintf("results%02d.csv", i))
  rm(DT)
}

If it works, it should generate 70 csv files, results01.csv, ..., results70.csv, each of size 1000000x1944 (except the last one, which is smaller); it is then possible to open them in Excel.
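Once such a results table exists, the zone of each point can be read off row by row. The sketch below assumes that the table holds 1 for "inside" and 0 for "outside", with one column per polygon; that coding, the small matrix res, and the name first_zone are assumptions made for illustration.

```r
# Hypothetical results: 3 points (rows) tested against 4 polygons (columns),
# with 1 meaning "inside" and 0 meaning "outside".
res <- rbind(
  c(0, 1, 0, 0),
  c(0, 0, 0, 0),
  c(1, 0, 0, 1)
)

# Index of the first polygon containing each point, NA if there is none.
first_zone <- apply(res, 1, function(r) {
  hits <- which(r == 1)
  if (length(hits) == 0) NA_integer_ else hits[1]
})

first_zone
```

A point that falls into several polygons, like the third row here, is assigned to the first matching one; whether that is the desired rule depends on the data.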

I've tried the code and got an error: Error: cannot allocate vector of size 7.6 Mb.

We need a finer splitting:

lat_lon_list <- vector("list", 2*69+1)
for(i in 1:(2*69)){
  lat_lon_list[[i]] = lat_lon[(1+(i-1)*1e6/2):(i*1e6/2),]
}
lat_lon_list[[2*69+1]] <- lat_lon[69000001:nrow(lat_lon),]

for(i in 1:(2*69+1)){
  DT <- data.table(V1 = pnt.in.poly2(lat_lon_list[[i]], polys[[1]]))
  for(j in 2:length(polys)){
    DT[, (sprintf("V%d",j)):=pnt.in.poly2(lat_lon_list[[i]], polys[[j]])]
  }
  fwrite(DT, sprintf("results%02d.csv", i))
  rm(DT)
}
