st_join on geometry and grouping column together


Question

If you have spatial point and polygon time series data, how do you do a spatial join/merge and a "normal" merge on a non-spatial variable at the same time?

I have point data over several years that I want to merge into yearly polygons and then summarise (xvar) by year:

#spatial point data by year
library(sf)
set.seed(10)
df_point <- data.frame(id = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3,
                              4, 4, 5, 5,
                              6, 6, 7, 7), 
                       year = c(2016, 2017, 2018, 2019, 2016, 2017, 2018, 2019, 2016, 2017,
                                2016, 2017, 2016, 2017,
                                2016, 2017, 2016, 2017),
                       xvar = sample(1:10, 18, replace = TRUE))
df_point$geometry <- st_cast(st_sfc(st_multipoint(rbind(c(.1, .2), c(.1, .2), c(.1, .2), c(.1, .2),
                                                        c(.3, 1), c(.3, 1), c(.3, 1), c(.3, 1),
                                                        c(1, 1), c(1, 1),
                                                        
                                                        c(2, 2.1), c(2, 2.1), c(2.2, 2.4), c(2.2, 2.4),
                                                        c(4, 2.1), c(4, 2.1), c(4, 2.2), c(4, 2.2)))), "POINT")
                                                        
df_point <- st_as_sf(df_point)
df_point
# Simple feature collection with 18 features and 3 fields
# geometry type:  POINT
# dimension:      XY
# bbox:           xmin: 0.1 ymin: 0.2 xmax: 4 ymax: 2.4
# CRS:            NA
# First 10 features:
#    id year xvar        geometry
# 1   1 2016    9 POINT (0.1 0.2)
# 2   1 2017   10 POINT (0.1 0.2)
# 3   1 2018    7 POINT (0.1 0.2)
# 4   1 2019    8 POINT (0.1 0.2)
# 5   2 2016    6   POINT (0.3 1)
# 6   2 2017    7   POINT (0.3 1)
# 7   2 2018    3   POINT (0.3 1)
# 8   2 2019    8   POINT (0.3 1)
# 9   3 2016   10     POINT (1 1)
# 10  3 2017    7     POINT (1 1)

And polygon data:

df_poly <- data.frame(poly_id = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3), 
                      year = rep(2016:2019, each = 3))  
pol = st_polygon(list(rbind(c(0, 0), c(2, 0), c(2, 2), c(0, 2), c(0, 0))))
b = st_sfc(pol, pol + c(2, 2), pol + c(4, .8))
df_poly$geomtry <- c(b, b, b, b)
df_poly <- st_as_sf(df_poly)
df_poly
# Simple feature collection with 12 features and 2 fields
# geometry type:  POLYGON
# dimension:      XY
# bbox:           xmin: 0 ymin: 0 xmax: 6 ymax: 4
# CRS:            NA
# First 10 features:
#    poly_id year                        geomtry
# 1        1 2016 POLYGON ((0 0, 2 0, 2 2, 0 ...
# 2        2 2016 POLYGON ((2 2, 4 2, 4 4, 2 ...
# 3        3 2016 POLYGON ((4 0.8, 6 0.8, 6 2...
# 4        1 2017 POLYGON ((0 0, 2 0, 2 2, 0 ...
# 5        2 2017 POLYGON ((2 2, 4 2, 4 4, 2 ...
# 6        3 2017 POLYGON ((4 0.8, 6 0.8, 6 2...
# 7        1 2018 POLYGON ((0 0, 2 0, 2 2, 0 ...
# 8        2 2018 POLYGON ((2 2, 4 2, 4 4, 2 ...
# 9        3 2018 POLYGON ((4 0.8, 6 0.8, 6 2...
# 10       1 2019 POLYGON ((0 0, 2 0, 2 2, 0 ...

Desired output:

df_sf_merge
# Simple feature collection with 12 features and 3 fields
# geometry type:  POLYGON
# dimension:      XY
# bbox:           xmin: 0 ymin: 0 xmax: 6 ymax: 4
# CRS:            NA
#    poly_id year total_sum                        geomtry
# 1        1 2016        25 POLYGON ((0 0, 2 0, 2 2, 0 ...
# 2        2 2016        32 POLYGON ((2 2, 4 2, 4 4, 2 ...
# 3        3 2016        14 POLYGON ((4 0.8, 6 0.8, 6 2...
# 4        1 2017        24 POLYGON ((0 0, 2 0, 2 2, 0 ...
# 5        2 2017        22 POLYGON ((2 2, 4 2, 4 4, 2 ...
# 6        3 2017        12 POLYGON ((4 0.8, 6 0.8, 6 2...
# 7        1 2018        10 POLYGON ((0 0, 2 0, 2 2, 0 ...
# 8        2 2018        NA POLYGON ((2 2, 4 2, 4 4, 2 ...
# 9        3 2018        NA POLYGON ((4 0.8, 6 0.8, 6 2...
# 10       1 2019        16 POLYGON ((0 0, 2 0, 2 2, 0 ...
# 11       2 2019        NA POLYGON ((2 2, 4 2, 4 4, 2 ...
# 12       3 2019        NA POLYGON ((4 0.8, 6 0.8, 6 2...

The general approach for one time point would be something like:

library(dplyr)
df_sf_merge <- df_poly %>% 
  st_join(df_point) %>%  #AND MERGE OF YEAR?
  group_by(poly_id, year) %>% #year.x or year.y
  summarise(total_sum = sum(xvar, na.rm = TRUE))

But this won't work, because the join creates duplicate copies, pairing each yearly polygon with points from every year:

df_sf_merge <- df_poly %>% 
  st_join(df_point) %>% 
  dplyr::arrange(id, year.x)
df_sf_merge
# Simple feature collection with 88 features and 5 fields
# geometry type:  POLYGON
# dimension:      XY
# bbox:           xmin: 0 ymin: 0 xmax: 6 ymax: 4
# CRS:            NA
# First 10 features:
#    poly_id year.x id year.y xvar                        geomtry
# 1        1   2016  1   2016    9 POLYGON ((0 0, 2 0, 2 2, 0 ...
# 2        1   2016  1   2017   10 POLYGON ((0 0, 2 0, 2 2, 0 ...
# 3        1   2016  1   2018    7 POLYGON ((0 0, 2 0, 2 2, 0 ...
# 4        1   2016  1   2019    8 POLYGON ((0 0, 2 0, 2 2, 0 ...
# 5        1   2017  1   2016    9 POLYGON ((0 0, 2 0, 2 2, 0 ...
# 6        1   2017  1   2017   10 POLYGON ((0 0, 2 0, 2 2, 0 ...
# 7        1   2017  1   2018    7 POLYGON ((0 0, 2 0, 2 2, 0 ...
# 8        1   2017  1   2019    8 POLYGON ((0 0, 2 0, 2 2, 0 ...
# 9        1   2018  1   2016    9 POLYGON ((0 0, 2 0, 2 2, 0 ...
# 10       1   2018  1   2017   10 POLYGON ((0 0, 2 0, 2 2, 0 ...

I could remove the duplicates in a roundabout way, but I don't want the duplicate copies to be made in the first place, as they slow the process down considerably when working with large files.

I'm not sure if you can do a spatial and a normal join at the same time, but I'm sure there's an easier workaround?

Any suggestions? Thanks.

Recommended answer

One solution is to split the two data frames into two lists of per-year data frames and then iterate over them using map2(). So, 2016 points get st_join()ed only to the 2016 polygons, 2017 points to the 2017 polygons, etc.

map2_dfr() is the same as map2(), except that it row-binds the resulting list into a single data frame.
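
The split-and-pair pattern can be sketched in base R on toy, non-spatial data, with Map() standing in for map2() and do.call(rbind, ...) standing in for the row-binding step. The tables `polys` and `points` are made up for illustration, and merge() on `poly_id` is only a placeholder for the spatial st_join(). The pairing is positional, so the sketch assumes both lists contain the same years in the same order.

```r
# Toy stand-ins for the yearly polygon and point tables (no geometry here).
polys  <- data.frame(poly_id = c(1, 2, 1, 2),
                     year    = c(2016, 2016, 2017, 2017))
points <- data.frame(poly_id = c(1, 1, 2, 1),
                     year    = c(2016, 2016, 2016, 2017),
                     xvar    = c(3, 4, 5, 6))

# One list element per year; the year column is dropped from the points
# so it is not duplicated after the join.
polys_by_year  <- split(polys, polys$year)
points_by_year <- split(points[, c("poly_id", "xvar")], points$year)

# Both lists must cover the same years, in the same order.
stopifnot(identical(names(polys_by_year), names(points_by_year)))

# Pair the lists element by element (the role map2() plays),
# keeping polygons that have no matching point (left join).
joined <- Map(function(p, q) merge(p, q, by = "poly_id", all.x = TRUE),
              polys_by_year, points_by_year)

# Row-bind the per-year results into one data frame.
result <- do.call(rbind, joined)
result
```

Because each year is joined only against its own points, no cross-year copies are created.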

library(dplyr)
library(purrr)
df_point_list <- split(select(df_point, -year), # drop the year column for one of these objects so we don't get year.x and year.y
                       df_point$year)
df_poly_list <- split(df_poly, df_poly$year)


df_sf_merge <- map2_dfr(df_poly_list, df_point_list,
                        ~ .x %>% 
                          st_join(.y) %>% 
                          group_by(poly_id, year) %>% 
                          summarise(total_sum = sum(xvar, na.rm = TRUE)))

df_sf_merge

# Simple feature collection with 12 features and 3 fields
# geometry type:  POLYGON
# dimension:      XY
# bbox:           xmin: 0 ymin: 0 xmax: 6 ymax: 4
# CRS:            NA
# First 10 features:
#    poly_id year total_sum                        geomtry
# 1        1 2016        25 POLYGON ((0 0, 2 0, 2 2, 0 ...
# 2        2 2016        32 POLYGON ((2 2, 4 2, 4 4, 2 ...
# 3        3 2016        14 POLYGON ((4 0.8, 6 0.8, 6 2...
# 4        1 2017        24 POLYGON ((0 0, 2 0, 2 2, 0 ...
# 5        2 2017        22 POLYGON ((2 2, 4 2, 4 4, 2 ...
# 6        3 2017        12 POLYGON ((4 0.8, 6 0.8, 6 2...
# 7        1 2018        10 POLYGON ((0 0, 2 0, 2 2, 0 ...
# 8        2 2018         0 POLYGON ((2 2, 4 2, 4 4, 2 ...
# 9        3 2018         0 POLYGON ((4 0.8, 6 0.8, 6 2...
# 10       1 2019        16 POLYGON ((0 0, 2 0, 2 2, 0 ...
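
This output shows 0 for polygons with no points in a given year, whereas the desired output in the question shows NA. That is because sum() with na.rm = TRUE returns 0 when every value is missing. If NA is preferred, one option is a small wrapper such as the one below. The name `sum_or_na` is hypothetical, not part of any package, and this is only a sketch of one possible approach.

```r
# Hypothetical helper: a sum that yields NA when every value is missing,
# instead of the 0 that sum(..., na.rm = TRUE) returns.
sum_or_na <- function(x) {
  if (all(is.na(x))) NA_real_ else sum(x, na.rm = TRUE)
}

sum(c(NA, NA), na.rm = TRUE)  # 0
sum_or_na(c(NA, NA))          # NA
sum_or_na(c(2, NA, 5))        # 7
```

It could then replace the sum() call inside summarise(), for example summarise(total_sum = sum_or_na(xvar)).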
