Keep region names when tidying a map using the broom package

Question

I am using the getData function from the raster package to retrieve a map of Argentina. I would like to plot the resulting map with ggplot2, so I am converting it to a data frame using the tidy function from the broom package. This works fine, but I can't figure out how to preserve the names of the federal districts so that I can use them on the map.

Here is my original code, which does not preserve the district names:

# Original code: ##################################
library(magrittr)  # provides the %>% pipe
# get the map data from GADM.org and then simplify it
arg_map_1 <- raster::getData(country = "ARG", level = 1, path = "./data/") %>%
  # simplify
  rmapshaper::ms_simplify(keep = 0.01) %>% 
  # tidy to a dataframe
  broom::tidy()

# plot the map
library(ggplot2)
ggplot(data=arg_map_1) +
  geom_map(map=arg_map_1, aes(x=long, y=lat, map_id=id, fill=id),
       color="#000000", size=0.25)

And here is the code with a hack that pulls the district names out of the SPDF and uses them as the map IDs:

# Code with a hack to keep the district names: ################################
library(magrittr)  # provides the %>% pipe
# get the map data from GADM.org and then simplify it
arg_map_1 <- raster::getData(country = "ARG", level = 1, path = "./data/") %>%
  # simplify
  rmapshaper::ms_simplify(keep = 0.01)  

for(region_looper in seq_along(arg_map_1@data$NAME_1)){
  arg_map_1@polygons[[region_looper]]@ID <- 
    as.character(arg_map_1@data$NAME_1[region_looper]) 
}

# tidy to a dataframe
arg_map_1 <- arg_map_1 %>% 
  broom::tidy()

library(ggplot2)
ggplot(data=arg_map_1) +
  geom_map(map=arg_map_1, aes(x=long, y=lat, map_id=id, fill=id),
           color="#000000", size=0.25)

I keep thinking that there must be some way to use the tidy function that preserves the names, but for the life of me, I can't figure it out.

Recommended answer

You can use the join function from the plyr package. Here is a general solution (it looks long, but it is actually very easy):


  1. Load the shapefile: Say you have a shapefile my_shapefile.shp in your working directory. Let's load it:

library(rgdal)  # provides readOGR()
shape <- readOGR(dsn = "/my_working_directory", layer = "my_shapefile")

Notice that the loaded object carries a data frame of attributes, which can be accessed with shape@data. For example, this data frame could look like this:

> head(shape@data)
       code                   region     label
0 E12000006          East of England E12000006
1 E12000007                   London E12000007
2 E12000002               North West E12000002
3 E12000001               North East E12000001
4 E12000004            East Midlands E12000004
5 E12000003 Yorkshire and The Humber E12000003


  2. Create a new data frame from the shapefile: Use the broom package to tidy the shapefile:

    library(broom)  # provides tidy()
    new_df <- tidy(shape)
    


    The result looks like this:

    > head(new_df)
          long      lat order  hole piece group id           
    1 547491.0 193549.0     1 FALSE     1   0.1  0 
    2 547472.1 193465.5     2 FALSE     1   0.1  0 
    3 547458.6 193458.2     3 FALSE     1   0.1  0 
    4 547455.6 193456.7     4 FALSE     1   0.1  0 
    5 547451.2 193454.3     5 FALSE     1   0.1  0 
    6 547447.5 193451.4     6 FALSE     1   0.1  0
    

    Unfortunately, tidy() dropped the attribute columns ("region", in this example). Instead, we got a new variable "id", starting at 0. Fortunately, the ordering of "id" is the same as the row order of shape@data$region. Let us use this to recover the names.


  3. Create an auxiliary data frame with the region names: Let us create a new data frame holding the names, one row per region. Additionally, we will add an "id" variable, identical to the one tidy() created:

    # Recover row name 
    temp_df <- data.frame(shape@data$region)
    names(temp_df) <- c("region")
    # Create and append "id"
    temp_df$id <- seq(0,nrow(temp_df)-1)
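
    A minimal sketch of a slightly more defensive way to build the same lookup table, assuming (as is usually the case for sp objects, but worth checking on your own data) that the "id" values produced by tidy() are the polygon IDs and that these match the row names of shape@data. The toy data frame attr_df below is a hypothetical stand-in for shape@data, so the example runs in base R without any shapefile:

```r
# Toy stand-in for shape@data (hypothetical values); the row names play
# the role of the polygon IDs, which here happen to be "0", "1", "2".
attr_df <- data.frame(
  region = c("East of England", "London", "North West"),
  row.names = c("0", "1", "2"),
  stringsAsFactors = FALSE
)

# Key the lookup table by row name rather than by position, so it still
# lines up if the IDs are not a plain 0..n-1 sequence.
lookup <- data.frame(
  id = rownames(attr_df),
  region = attr_df$region,
  stringsAsFactors = FALSE
)

print(lookup)
```

    Keeping "id" as a character column may also avoid type mismatches in joins, if the "id" column returned by tidy() is character in your version of broom.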
    


  4. Merge the names into the new data frame using "id": Finally, let us put the names back into the new data frame:

    library(plyr)  # provides join()
    new_df <- join(new_df, temp_df, by="id")
    


    That's it! You can even add more variables to the new data frame by using the join command and the "id" index. The final result would be something like this (the joined name column is shown as "name", alongside two further joined variables, "var1" and "var2"):

    > head(new_df)
          long      lat order  hole piece group id            name    var1    var2 
    1 547491.0 193549.0     1 FALSE     1   0.1  0 East of England   0.525   0.333   
    2 547472.1 193465.5     2 FALSE     1   0.1  0 East of England   0.525   0.333   
    3 547458.6 193458.2     3 FALSE     1   0.1  0 East of England   0.525   0.333   
    4 547455.6 193456.7     4 FALSE     1   0.1  0 East of England   0.525   0.333   
    5 547451.2 193454.3     5 FALSE     1   0.1  0 East of England   0.525   0.333   
    6 547447.5 193451.4     6 FALSE     1   0.1  0 East of England   0.525   0.333   
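
    For readers who prefer to avoid plyr, the same join can be sketched in base R with merge(). This is only an illustration on hypothetical toy data (coords stands in for the output of tidy(), lookup for the auxiliary data frame); it is not the method used in the answer. One difference matters for maps: plyr's join() keeps the row order of its first argument, whereas merge() may reorder rows, so the vertex order is restored explicitly before plotting.

```r
# Toy stand-in for the output of tidy(shape) (hypothetical coordinates)
coords <- data.frame(
  long  = c(0, 1, 1, 5, 6, 6),
  lat   = c(0, 0, 1, 5, 5, 6),
  order = 1:6,
  id    = c("0", "0", "0", "1", "1", "1"),
  stringsAsFactors = FALSE
)

# Toy lookup table: one row per polygon id
lookup <- data.frame(
  id     = c("0", "1"),
  region = c("East of England", "London"),
  stringsAsFactors = FALSE
)

# Left join on "id"; all.x = TRUE keeps every vertex even without a match
joined <- merge(coords, lookup, by = "id", all.x = TRUE)

# merge() may reorder rows, so restore the vertex order before plotting
joined <- joined[order(joined$order), ]

print(joined)
```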
    
