如何在R的轮廓中找到点? [英] How to find points within contours in R?

查看:57
本文介绍了如何在R的轮廓中找到点?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我研究了

多级"的原因是给定层中的每个多边形是分开的,因此每级可能有> 1个:

  cl<-outlineLines(dens,level = levels)sort(table(sapply(cl,`[[`,"level")))## 0.00519851181336958 0.00765971436995347 0.0107843979424224 0.0128423136194731## 2 3 3 3 

因此,在计算每个多边形的点数时只需考虑这一点:

  setNames(lapply(cl,function(poly)sum(sp :: point.in.polygon(xdf $ x,xdf $ y,poly $ x,poly $ y))),sapply(cl,`[[`,"level"))->level_ctsstr(level_cts)## 11的清单## $ 0.00519851181336958:整数91## $ 0.00519851181336958:整数174## $ 0.00765971436995347:整数78## $ 0.00765971436995347:整数57## $ 0.00765971436995347:整数74## $ 0.0107843979424224:整数65## $ 0.0107843979424224:整数34## $ 0.0107843979424224:整数33## $ 0.0128423136194731:整数42## $ 0.0128423136194731:整数10## $ 0.0128423136194731:整数3 

然后我们可以总结一下:

  sapply(split(level_cts,名称(level_cts)),功能(级别)sum(取消列表(级别)))->pt_ctspt_cts## 0.00519851181336958 0.00765971436995347## 265209## 0.0107843979424224 0.0128423136194731## 132 55 

然后,获取%:

  pt_cts/nrow(xdf)## 0.00519851181336958 0.00765971436995347## 0.8862876 0.6989967## 0.0107843979424224 0.0128423136194731## 0.4414716 0.1839465 

更新

我们不仅可以计算百分比,还可以将级别分配给原始数据:

  do.call(rbind.data.frame,lapply(cl,function(poly){#遍历每个多边形#找出此polgyon中的哪些ptwhich_pts<-as.logical(sp :: point.in.polygon(xdf $ x,xdf $ y,poly $ x,poly $ y))tdf<-xdf [which_pts,]#将它们分配给临时数据帧tdf $ level<-poly $ level#添加级别tdf}))->new_xdfdplyr :: glimpse(new_xdf)##观察值:661##变量:3## $ x< dbl>2.000000,2.033333,1.833333,1.616667,1.766667,2.0000 ...## $ y< dbl>77,77,81,89,73,83,84,85,79,75,91,87,86,78,...## $ level< dbl>0.005198512、0.005198512、0.005198512、0.005198512、0.00 ...#,虽然您可能需要级别值,但这会添加级别#和&的列.概率new_xdf $ level_num<-as.integer(factor(new_xdf $ level,levels,labels = 1:length(levels)))new_xdf $ prob<-as.numeric(as.character(factor(new_xdf $ level,level,labels = prob)))dplyr :: glimpse(new_xdf)##观察值:661##变量:5## $ x< dbl>2.000000、2.033333、1.833333、1.616667、1.766667、2 ....## $ y< dbl>77,77,81,89,73,83,84,85,79,75,91,87,86,...## $ level< dbl>0.005198512、0.005198512、0.005198512、0.005198512,...## $ level_num< int>1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1 ...## $ prob< dbl>0.7、0.7、0.7、0.7、0.7、0.7、0.7、0.7、0.7、0.7、0 ....dplyr :: count(new_xdf,level,level_num,prob)###小动作:4 x 4## level level_num概率n##< dbl>< int>< dbl>< int>## 1 0.00520 1 0.700 265## 2 0.00766 2 0.500 209## 3 0.0108 3 0.250 132## 4 0.0128 4 0.100 55 

I've studied this question and created my own 4-contour map based on several thousands of pairs of longitude and latitude points, but I'm not getting the correct number of points inside each of the 4 contours using the points.in.polygon method mentioned in the above question.

Here is the code so far using MASS library:

 # use kde2d function to create kernel density estimates 
x <- pedestrian.df$longitude
y <- pedestrian.df$latitude
dens <- kde2d(x, y, n=200)

# create the contours to plot - 70%, 50%, 25%, 10% of density contained in each contour 
prob <- c(0.7, 0.5, 0.25, 0.1)
dx <- diff(dens$x[1:4])
dy <- diff(dens$y[1:4])
sz <- sort(dens$z)
c1 <- cumsum(sz) * dx * dy 
levels <- sapply(prob, function(x) { 
    approx(c1, sz, xout = 1 - x)$y
})

#create the contour plot using smoothScatter which smooths the collisions into kernel densities 

smoothScatter(x,y) + contour(dens, levels=levels, labels=prob, col = c("green", "yellow", "orange", "red"), lwd = 1.5, add=T)

This correctly generates the what I expected:

I then tried to use the points.in.polygon function from sp library as in the answer to the above linked question:

ls <- contourLines(dens, level=levels)
zone_1 <- point.in.polygon(df$longitude, df$latitude, ls[[4]]$x, ls[[4]]$y)
zone_2 <- point.in.polygon(df$longitude, df$latitude, ls[[3]]$x, ls[[3]]$y)
zone_3 <- point.in.polygon(df$longitude, df$latitude, ls[[2]]$x, ls[[2]]$y)
zone_4 <- point.in.polygon(df$longitude, df$latitude, ls[[1]]$x, ls[[1]]$y)

But this results in incorrect number of points per zone or contour. I know this is not right because each contour should have progressively more points as the contour gets larger.

I tried looking at ls (a list that stores a list of all the x and y coordinates of the polygons), but there are 15 levels, not the 4 I intuitively would've thought would be there. There are even multiple levels among the 15 that have the same value. I suspect the answer to my issue lies in subsetting this list of lists correctly to include the 4 levels that correspond to my 4 contours, but ls[[1:7]]$x, ls[[1:7]]$y doesn't work.

Thanks for any help and let me know if I could clarify anything!

解决方案

I think pedestrian is your own data vs something in pkg and since it's not part of the question, we'll use a different one:

library(MASS)
library(sp)

attach(geyser)

data.frame(
  x = geyser$duration,
  y = geyser$waiting
) -> xdf

dens <- kde2d(xdf$x, xdf$y, n = 100)

prob <- c(0.7, 0.5, 0.25, 0.1)

dx <- diff(dens$x[1:4])
dy <- diff(dens$y[1:4])
sz <- sort(dens$z)
c1 <- cumsum(sz) * dx * dy 

levels <- sapply(prob, function(x) { 
  approx(c1, sz, xout = 1 - x)$y
})

smoothScatter(x,y) +
  contour(dens, levels=levels, labels=prob, col = c("green", "yellow", "orange", "red"), lwd = 1.5, add=TRUE)

The reason for 'multiple levels' is that each polygon in a given layer is separate so there are potentially > 1 per-level:

cl <- contourLines(dens, level=levels)

sort(table(sapply(cl, `[[`, "level")))
## 0.00519851181336958 0.00765971436995347  0.0107843979424224  0.0128423136194731 
##                   2                   3                   3                   3

So, just account for that when calculating points-per-polygon:

setNames(
  lapply(cl, function(poly) sum(sp::point.in.polygon(xdf$x, xdf$y, poly$x, poly$y))),
  sapply(cl, `[[`, "level")
) -> level_cts

str(level_cts)
## List of 11
##  $ 0.00519851181336958: int 91
##  $ 0.00519851181336958: int 174
##  $ 0.00765971436995347: int 78
##  $ 0.00765971436995347: int 57
##  $ 0.00765971436995347: int 74
##  $ 0.0107843979424224 : int 65
##  $ 0.0107843979424224 : int 34
##  $ 0.0107843979424224 : int 33
##  $ 0.0128423136194731 : int 42
##  $ 0.0128423136194731 : int 10
##  $ 0.0128423136194731 : int 3

Then we can sum them up:

sapply(
  split(level_cts, names(level_cts)),
  function(level) sum(unlist(level))
) -> pt_cts

pt_cts

## 0.00519851181336958 0.00765971436995347 
##                 265                 209 
##  0.0107843979424224  0.0128423136194731 
##                 132                  55 

And, get %:

pt_cts / nrow(xdf)
## 0.00519851181336958 0.00765971436995347 
##           0.8862876           0.6989967 
##  0.0107843979424224  0.0128423136194731 
##           0.4414716           0.1839465 

UPDATE

Rather than just compute percentages, we can also assign the level to the original data:

do.call(
  rbind.data.frame,
  lapply(cl, function(poly) { # iterate over each polygon
    # figure out which pts are in this polgyon
    which_pts <- as.logical(sp::point.in.polygon(xdf$x, xdf$y, poly$x, poly$y))
    tdf <- xdf[which_pts,] # assign them to a temp data frame
    tdf$level <- poly$level # add the level
    tdf
  })
) -> new_xdf

dplyr::glimpse(new_xdf)
## Observations: 661
## Variables: 3
## $ x     <dbl> 2.000000, 2.033333, 1.833333, 1.616667, 1.766667, 2.0000...
## $ y     <dbl> 77, 77, 81, 89, 73, 83, 84, 85, 79, 75, 91, 87, 86, 78, ...
## $ level <dbl> 0.005198512, 0.005198512, 0.005198512, 0.005198512, 0.00...

# while you likely want the level value, this adds columns for level # & prob
new_xdf$level_num <- as.integer(factor(new_xdf$level, levels, labels=1:length(levels)))
new_xdf$prob <- as.numeric(as.character(factor(new_xdf$level, levels, labels=prob)))

dplyr::glimpse(new_xdf)
## Observations: 661
## Variables: 5
## $ x         <dbl> 2.000000, 2.033333, 1.833333, 1.616667, 1.766667, 2....
## $ y         <dbl> 77, 77, 81, 89, 73, 83, 84, 85, 79, 75, 91, 87, 86, ...
## $ level     <dbl> 0.005198512, 0.005198512, 0.005198512, 0.005198512, ...
## $ level_num <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1...
## $ prob      <dbl> 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0....

dplyr::count(new_xdf, level, level_num, prob)
## # A tibble: 4 x 4
##     level level_num  prob     n
##     <dbl>     <int> <dbl> <int>
## 1 0.00520         1 0.700   265
## 2 0.00766         2 0.500   209
## 3 0.0108          3 0.250   132
## 4 0.0128          4 0.100    55

这篇关于如何在R的轮廓中找到点?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆