缓冲区重叠时计数点数 [英] Counting points when buffers are overlapping
本文介绍了缓冲区重叠时计数点数的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!
问题描述
我在下面包含了我的所有代码和指向示例数据的链接。
简要说明:我的缓冲区重叠;我想计算离学校一定距离内的商店数量。
我特别想知道离学校1000米以内有多少家商店,离学校2000米以内有多少家商店,因为我想比较一下差异。当然,其中一些学校的缓冲是重叠的。因此,一家商店可能距离A校1500米,但距离B校只有750米。因此,它被算作距离一所学校1000米以内,应该只被计算在B校的1000米内,而不算在A校。如果一家商店距离两所学校2000米以内(但不在1000米以内),它需要计算到离它最近的学校。
因此,理想情况下,我希望数据集如下所示:
学校 | 存储1000M | 存储2000M |
---|---|---|
学校A | 3 | 6 |
B学校 | 2 | 7 |
所以我使用sf中的st_Union函数来组合缓冲区。这可以很好地生成漂亮的地图,但当我使用长度和st_intersects来计算缓冲区内的商店时,它只为每种类型的区域返回一个数字(1000m对2000m)
样本数据:Sample data
county.sf <- get_acs(state = "MO",
county = c( "St. Louis City"),
geography = "tract",
variables = "B03002_001",
output="wide",
geometry = TRUE) %>%
sf::st_transform(crs = "ESRI:102003")
class(county.sf)
# School data
school <- read.csv("C:\myfile1.csv")
school.sf <- st_as_sf(school, coords = c("long", "lat"), crs = "epsg:4326")
school.sf.utm <- st_transform(school.sf, crs = "ESRI:102003")
# Store data
store <- import("C:\myfile2.csv")
store.sf <- st_as_sf(store, coords = c("XCoord", "YCoord"), crs = "ESRI:102696")
store.sf.utm <- st_transform(store.sf, crs = "ESRI:102003")
elem.buff <-st_buffer(school.sf.utm, 1000)
elem.buff2 <-st_buffer(school.sf.utm, 2000)
pts_com<-st_union(elem.buff)
pts_pol<-st_cast(pts_com, "POLYGON")
pts_com2<-st_union(elem.buff2)
pts_pol2<-st_cast(pts_com2, "POLYGON")
#unmerged zone map
ex.map<- tm_shape(county.sf) +
tm_polygons() +
tm_shape(elem.buff) +
tm_borders(col="red") +
tm_shape(school.sf.utm) +
tm_dots(col = "red") +
tm_shape(elem.buff2) +
tm_borders(col="blue") +
tm_shape(pts_pol) +
tm_borders(col="black") +
tm_shape(store.sf.utm) +
tm_dots()
ex.map
#merged zones map
ex.map<- tm_shape(county.sf) +
tm_polygons() +
#(elem.buff) +
#tm_borders(col="red") +
tm_shape(school.sf.utm) +
tm_dots(col = "red") +
#tm_shape(elem.buff2) +
#tm_borders(col="blue") +
tm_shape(pts_pol) +
tm_borders(col="red") +
tm_shape(store.sf.utm) +
tm_dots() +
tm_shape(pts_pol2) +
tm_borders(col="blue")
ex.map
(school$pt_count <- lengths(st_intersects(elem.buff, store.sf.utm))) #gives per school but ignores overlapping
(school$pt_count <- lengths(st_intersects(pts_com, store.sf.utm)))
(school$pt_count <- lengths(st_intersects(elem.buff2, store.sf.utm)))
(school$pt_count <- lengths(st_intersects(pts_com2, store.sf.utm)))
推荐答案
针对更大的数据更新了更高效的答案:
我承认,以前的答案依赖于对所有学校进行list
列,并使用unnest()
找到每个组合,这不适用于更大的数据。
正如@JandraLacko在评论中所建议的,st_nearest_feature()
是您的朋友;毫不奇怪,它比我提出的‘手动’方法更有效。
如上所述,加载库和数据
library(readxl)
library(tidyverse)
library(sf)
library(tmap)
tmap_mode('view')
read_xlsx('Schools and Stores_all.xlsx', sheet = 1) %>%
st_as_sf(., coords = c("long", "lat"), crs = "epsg:4326") %>%
st_transform(crs = "ESRI:102003") %>%
{. ->> school.sf.utm}
read_xlsx('Schools and Stores_all.xlsx', sheet = 2) %>%
st_as_sf(., coords = c("XCoord", "YCoord"), crs = "ESRI:102696") %>%
st_transform(crs = "ESRI:102003") %>%
{. ->> store.sf.utm}
然后,我们使用st_join()
连接商店和学校数据,并指定join = st_nearest_feature
以便它连接每个商店最近的学校(的名称)。然后我们加入每个学派的几何,使用left_join()
。有关详细信息,请参阅?st_join
。因此,最终,这给了我们每个商店最近的学校。
# find the closest school to each store (this is the school it counts towards)
store.sf.utm %>%
rename(
store_geometry = geometry
) %>%
st_join(
school.sf.utm,
join = st_nearest_feature
) %>%
left_join(
school.sf.utm %>%
as_tibble %>%
rename(
school_geometry = geometry
)
) %>%
{. ->> all_combos}
all_combos
# Simple feature collection with 2603 features and 1 field
# Active geometry column: store_geometry
# Geometry type: POINT
# Dimension: XY
# Bounding box: xmin: 489948.3 ymin: 131719.1 xmax: 501438.8 ymax: 157382.1
# Projected CRS: USA_Contiguous_Albers_Equal_Area_Conic
# # A tibble: 2,603 x 3
# store_geometry School school_geometry
# <POINT [m]> <chr> <POINT [m]>
# 1 (489948.3 137420.8) Community Access Job Training (491117.8 136616.5)
# 2 (490119.7 136712.7) Community Access Job Training (491117.8 136616.5)
# 3 (490171.8 138758.2) Gateway Science Acad/st Louis (491307.4 138787.2)
# 4 (490370.2 139681.3) Wilkinson Early Childhood Center (490930.4 140461.2)
# 5 (490568.3 137056.8) Community Access Job Training (491117.8 136616.5)
# 6 (490475 139013.4) Gateway Science Acad/st Louis (491307.4 138787.2)
# 7 (490527.6 139633.1) Wilkinson Early Childhood Center (490930.4 140461.2)
# 8 (490715.3 136690.1) Community Access Job Training (491117.8 136616.5)
# 9 (490552.5 139805.9) Wilkinson Early Childhood Center (490930.4 140461.2)
# 10 (490790 138069.5) Gateway Science Acad/st Louis (491307.4 138787.2)
# # ... with 2,593 more rows
有趣的是,我们的学校已经从156所减少到134所。我假设这意味着有22所学校不是离任何商店最近的。
# how many schools in all_combos?
all_combos %>%
summarise(
n_schools = n_distinct(School)
) %>%
pull(n_schools)
# [1] 134
现在我们知道了哪所学校是最近的,现在计算每个商店和它最近的学校之间的距离。
# calculate distance from each store to each school
all_combos %>%
mutate(
distance = as.numeric(st_distance(store_geometry, school_geometry, by_element = TRUE))
) %>%
filter(
distance <= 2000
) %>%
{. ->> all_combos_2}
all_combos_2
# Simple feature collection with 2595 features and 2 fields
# Active geometry column: store_geometry
# Geometry type: POINT
# Dimension: XY
# Bounding box: xmin: 489948.3 ymin: 131719.1 xmax: 501438.8 ymax: 152889.7
# Projected CRS: USA_Contiguous_Albers_Equal_Area_Conic
# # A tibble: 2,595 x 4
# store_geometry School school_geometry distance
# * <POINT [m]> <chr> <POINT [m]> <dbl>
# 1 (489948.3 137420.8) Community Access Job Training (491117.8 136616.5) 1419.
# 2 (490119.7 136712.7) Community Access Job Training (491117.8 136616.5) 1003.
# 3 (490171.8 138758.2) Gateway Science Acad/st Louis (491307.4 138787.2) 1136.
# 4 (490370.2 139681.3) Wilkinson Early Childhood Center (490930.4 140461.2) 960.
# 5 (490568.3 137056.8) Community Access Job Training (491117.8 136616.5) 704.
# 6 (490475 139013.4) Gateway Science Acad/st Louis (491307.4 138787.2) 863.
# 7 (490527.6 139633.1) Wilkinson Early Childhood Center (490930.4 140461.2) 921.
# 8 (490715.3 136690.1) Community Access Job Training (491117.8 136616.5) 409.
# 9 (490552.5 139805.9) Wilkinson Early Childhood Center (490930.4 140461.2) 756.
# 10 (490790 138069.5) Gateway Science Acad/st Louis (491307.4 138787.2) 885.
# # ... with 2,585 more rows
算出离他们最近的学校1000米以内有多少家商店。
all_combos_2 %>%
filter(
distance <= 1000
) %>%
group_by(School) %>%
summarise(
Stores1000m = n()
) %>%
st_drop_geometry %>%
{. ->> combo_sum_1000}
combo_sum_1000
# # A tibble: 134 x 2
# School Stores1000m
# * <chr> <int>
# 1 Academy At Boys & Girls Town 24
# 2 AcademyOf Envt Sci/math Elementary School 18
# 3 AcademyOf Envt Sci/math Middle School 2
# 4 Adams Elementary School 12
# 5 Ames Visual/perf. Arts 25
# 6 Ashland Elementary And Br. 49
# 7 Aspire Academy 26
# 8 Beaumont Cte High School 46
# 9 Bishop DuBourg High School 4
# 10 Bryan Hill Elementary School 19
# # ... with 124 more rows
2000米也是如此。
all_combos_2 %>%
filter(
distance <= 2000
) %>%
group_by(School) %>%
summarise(
Stores2000m = n()
) %>%
st_drop_geometry %>%
{. ->> combo_sum_2000}
combo_sum_2000
# # A tibble: 134 x 2
# School Stores2000m
# * <chr> <int>
# 1 Academy At Boys & Girls Town 24
# 2 AcademyOf Envt Sci/math Elementary School 18
# 3 AcademyOf Envt Sci/math Middle School 2
# 4 Adams Elementary School 12
# 5 Ames Visual/perf. Arts 25
# 6 Ashland Elementary And Br. 49
# 7 Aspire Academy 28
# 8 Beaumont Cte High School 52
# 9 Bishop DuBourg High School 4
# 10 Bryan Hill Elementary School 19
# # ... with 124 more rows
然后将这两个表联接在一起。
combo_sum_1000 %>%
full_join(combo_sum_2000) %>%
{. ->> combo_sum_joined}
combo_sum_joined
# # A tibble: 134 x 3
# School Stores1000m Stores2000m
# <chr> <int> <int>
# 1 Academy At Boys & Girls Town 24 24
# 2 AcademyOf Envt Sci/math Elementary School 18 18
# 3 AcademyOf Envt Sci/math Middle School 2 2
# 4 Adams Elementary School 12 12
# 5 Ames Visual/perf. Arts 25 25
# 6 Ashland Elementary And Br. 49 49
# 7 Aspire Academy 26 28
# 8 Beaumont Cte High School 46 52
# 9 Bishop DuBourg High School 4 4
# 10 Bryan Hill Elementary School 19 19
# # ... with 124 more rows
这篇关于缓冲区重叠时计数点数的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!
查看全文