Keep region names when tidying a map using the broom package
Question
I am using the getData function from the raster package to retrieve the map of Argentina. I would like to plot the resulting map using ggplot2, so I am converting it to a dataframe using the tidy function from the broom package. This works fine, but I can't figure out how to preserve the names of the federal districts so that I can use them on the map.
Here is my original code, which does not preserve the district names:
# Original code: ##################################
library(magrittr)  # provides the %>% pipe used below
# get the map data from GADM.org and then simplify it
arg_map_1 <- raster::getData(country = "ARG", level = 1, path = "./data/") %>%
  # simplify
  rmapshaper::ms_simplify(keep = 0.01) %>%
  # tidy to a dataframe
  broom::tidy()
# plot the map
library(ggplot2)
ggplot(data = arg_map_1) +
  geom_map(map = arg_map_1, aes(x = long, y = lat, map_id = id, fill = id),
           color = "#000000", size = 0.25)
And here is the code with a hack that pulls the district names out of the SPDF and uses them as the map IDs:
# Code with a hack to keep the district names: ################################
library(magrittr)  # provides the %>% pipe used below
# get the map data from GADM.org and then simplify it
arg_map_1 <- raster::getData(country = "ARG", level = 1, path = "./data/") %>%
  # simplify
  rmapshaper::ms_simplify(keep = 0.01)
# overwrite each polygon's ID with the matching district name
for (region_looper in seq_along(arg_map_1@data$NAME_1)) {
  arg_map_1@polygons[[region_looper]]@ID <-
    as.character(arg_map_1@data$NAME_1[region_looper])
}
# tidy to a dataframe
arg_map_1 <- arg_map_1 %>%
  broom::tidy()
library(ggplot2)
ggplot(data = arg_map_1) +
  geom_map(map = arg_map_1, aes(x = long, y = lat, map_id = id, fill = id),
           color = "#000000", size = 0.25)
I keep thinking that there must be some way to use the tidy function that preserves the names, but for the life of me, I can't figure it out.
Recommended answer
You can use the join function from the plyr package. Here is a general solution (it looks long, but it is actually very easy):
- Load shapefile: Let us say you have a shapefile my_shapefile.shp in your working directory. Let's load it:
library(rgdal)  # provides readOGR()
shape <- readOGR(dsn = "/my_working_directory", layer = "my_shapefile")
Notice that the loaded object contains a dataframe, which can be accessed with shape@data. For example, this dataframe could look like this:
> head(shape@data)
       code                   region     label
0 E12000006          East of England E12000006
1 E12000007                   London E12000007
2 E12000002               North West E12000002
3 E12000001               North East E12000001
4 E12000004            East Midlands E12000004
5 E12000003 Yorkshire and The Humber E12000003
- Create new dataframe from shapefile: Use the broom package to tidy the shapefile data into a dataframe:
library(broom)  # provides tidy()
new_df <- tidy(shape)
The result looks like this:
> head(new_df)
      long      lat order  hole piece group id
1 547491.0 193549.0     1 FALSE     1   0.1  0
2 547472.1 193465.5     2 FALSE     1   0.1  0
3 547458.6 193458.2     3 FALSE     1   0.1  0
4 547455.6 193456.7     4 FALSE     1   0.1  0
5 547451.2 193454.3     5 FALSE     1   0.1  0
6 547447.5 193451.4     6 FALSE     1   0.1  0
Unfortunately, tidy() dropped the attribute variables ("region", in this example). Instead, we got a new variable "id", starting at 0. Fortunately, the ordering of "id" is the same as the ordering of the rows in shape@data$region. Let us use this to recover the names.
- Create auxiliary dataframe with the region names: Let us create a new dataframe holding the region names. Additionally, we will add an "id" variable, identical to the one tidy() created:
# Recover the region names
temp_df <- data.frame(shape@data$region)
names(temp_df) <- c("region")
# Create and append "id" (as character, to match the type of the id column from tidy())
temp_df$id <- as.character(seq(0, nrow(temp_df) - 1))
- Merge the region names with the new dataframe using "id": Finally, let us put the names back into the new dataframe:
library(plyr)  # provides join()
new_df <- join(new_df, temp_df, by = "id")
That's it! You can even add more variables to the new dataframe by using the join command and the "id" index. The final result would be something like:
> head(new_df)
      long      lat order  hole piece group id          region  var1  var2
1 547491.0 193549.0     1 FALSE     1   0.1  0 East of England 0.525 0.333
2 547472.1 193465.5     2 FALSE     1   0.1  0 East of England 0.525 0.333
3 547458.6 193458.2     3 FALSE     1   0.1  0 East of England 0.525 0.333
4 547455.6 193456.7     4 FALSE     1   0.1  0 East of England 0.525 0.333
5 547451.2 193454.3     5 FALSE     1   0.1  0 East of England 0.525 0.333
6 547447.5 193451.4     6 FALSE     1   0.1  0 East of England 0.525 0.333
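The id-based lookup described above can be tried out without any spatial packages. The following is only a minimal sketch in base R, not the answer's own method: tidy_df and attr_df are hypothetical mock objects standing in for the output of tidy() and for shape@data, and the coordinates are made up. It assumes that the "id" column holds the polygon IDs as character strings and that these match the row names of the attribute table. Under that assumption the lookup can be keyed on the row names instead of a hard-coded 0..n-1 sequence, and match() attaches the names without reordering the vertex rows. The last step derives rough label positions as the mean of each region's vertices, which is a crude stand-in for a proper centroid.

```r
# Mock of what tidy() returns: one row per polygon vertex, with a character "id".
# (Hypothetical data for illustration only.)
tidy_df <- data.frame(
  long  = c(0, 1, 1, 0, 2, 3, 3, 2),
  lat   = c(0, 0, 1, 1, 0, 0, 1, 1),
  order = 1:8,
  id    = rep(c("0", "1"), each = 4),
  stringsAsFactors = FALSE
)

# Mock of shape@data: one row per region; row names are assumed to match
# the polygon IDs that end up in the "id" column.
attr_df <- data.frame(
  region = c("East of England", "London"),
  row.names = c("0", "1"),
  stringsAsFactors = FALSE
)

# Lookup table keyed on the row names rather than an assumed 0..n-1 sequence.
lookup <- data.frame(
  id     = row.names(attr_df),
  region = attr_df$region,
  stringsAsFactors = FALSE
)

# match() finds, for every vertex row, the lookup row with the same id,
# so the names are attached without changing the row order.
tidy_df$region <- lookup$region[match(tidy_df$id, lookup$id)]

# Rough label positions: mean vertex coordinates per region.
label_df <- aggregate(cbind(long, lat) ~ region, data = tidy_df, FUN = mean)
```

Keeping the row order intact matters because the polygons are drawn by following the vertices in sequence. With the names attached, fill = region can be used in the plot, and label_df could feed a text layer.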