Drawing maps based on census data using ggplot2

Views: 381 | Published: 2018/4/24 21:58:44 | Tags: r, ggplot2, geospatial, census


Question

I have a list of points that I want to overlay on a map of San Francisco using ggplot2. Each point is a longitude, latitude pair. I want the resulting map to be in a longitude/latitude coordinate system. I managed to reproduce Hadley Wickham's directions for plotting polygon shapefiles using his example file. I am using R 2.15.1 for Windows.

However, I tried to use cdp files downloaded from the UScensus2010cdp package. Here's my code snippet:

require("rgdal") 
require("maptools")
require("ggplot2")
require("sp")
require("plyr")
gpclibPermit() # required for fortify method
require(UScensus2010)
require(UScensus2010cdp)
data(california.cdp10)
sf <- city(name = "san francisco", state="ca")
sf.points = fortify(sf)

I get the following error:

Using name to define regions.
Error in unionSpatialPolygons(cp, invert(polys)) : input lengths differ
In addition: Warning message:
In split(as.numeric(row.names(attr)), addNA(attr[, region], TRUE)) :
   NAs introduced by coercion

Does anybody know:

What is a good value to give to the region parameter of fortify()?
If that fails, a source of map data with untransformed lat/long coordinates for San Francisco that ggplot2 can draw?
Alternatively, I found here another map of San Francisco, whose data is translated. Can you tell me how to either translate this data to raw lat/long or make the reverse translation for my set of points?

Answer

Note:

I was unable to access UScensus2010cdp, so I am using UScensus2000cdp, which reproduces the error.

The issue

The issue arises from the fact that fortify.SpatialPolygonsDataFrame relies on converting the row.names to numeric, and the rownames of your data are the identifiers.

ggplot2:::fortify.SpatialPolygonsDataFrame 

function (model, data, region = NULL, ...) 
{
    attr <- as.data.frame(model)
    if (is.null(region)) {
        region <- names(attr)[1]
        message("Using ", region, " to define regions.")
    }
    polys <- split(as.numeric(row.names(attr)), addNA(attr[, 
        region], TRUE))
    cp <- polygons(model)
    try_require(c("gpclib", "maptools"))
    unioned <- unionSpatialPolygons(cp, invert(polys))
    coords <- fortify(unioned)
    coords$order <- 1:nrow(coords)
    coords
}

In your case

row.names(sf@data)
## [1] "california_586" "california_590" "california_616"

are the identifiers you want to use as the region parameter, because place, state and name do not uniquely identify the three polygons.

# as.character used to coerce from factor
lapply(lapply(sf@data[,c('place','state','name')], unique), as.character)
## $place
## [1] "67000"
## 
## $state
## [1] "06"
## 
## $name
## [1] "San Francisco"

Because the row names form a character vector whose elements begin with alphabetic characters, coercing them to numeric yields NA:

as.numeric(rownames(sf@data))
## [1] NA NA NA
## Warning message:
## NAs introduced by coercion

This is the warning reported in your output.
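The coercion problem can be reproduced without the census packages. The following is a minimal base R sketch, assuming a hypothetical data frame `attr_df` that stands in for `sf@data` (only the row names and the `name` column are taken from the output above). It repeats the `split()` call from the function body shown earlier, first with identifier row names and then with default ones.

```r
# Hypothetical stand-in for sf@data: three rows whose row names are identifiers.
attr_df <- data.frame(
  name = rep("San Francisco", 3),
  row.names = c("california_586", "california_590", "california_616"),
  stringsAsFactors = FALSE
)

# The same coercion fortify() performs; the warning is suppressed here
# only to keep the demo output short.
ids <- suppressWarnings(as.numeric(row.names(attr_df)))
print(ids)

# Splitting NA indices by region leaves nothing usable to union.
polys_bad <- split(ids, addNA(attr_df[, "name"], TRUE))
print(polys_bad)

# With default row names ("1", "2", "3") the same call gives valid indices.
row.names(attr_df) <- NULL
polys_ok <- split(as.numeric(row.names(attr_df)), addNA(attr_df[, "name"], TRUE))
print(polys_ok)
```

With identifier row names every index is NA, which is consistent with the "input lengths differ" error that follows in `unionSpatialPolygons`.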

Solution

Copy the row names into a new column.
Set the row.names to NULL or 1:nrow(sf@data), so that they coerce cleanly to numeric.

So:

# rownames
sf@data[['place_id']] <- rownames(sf@data)
row.names(sf@data) <- NULL

# fortify
sf_ggplot <- fortify(sf, region = 'place_id')
# merge to add the original data
sf_ggplot_all <- merge(sf_ggplot, sf@data, by.x = 'id', by.y = 'place_id')
# very basic and uninteresting plot
ggplot(sf_ggplot_all, aes(x = long, y = lat, group = group)) +
  geom_polygon(aes(fill = pop2000)) +
  coord_map()
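One thing worth checking after the merge step: `merge()` sorts its result by the key columns and does not promise to keep the original row order, while polygon drawing depends on the vertices staying in sequence. The sketch below uses base R only and assumes two hypothetical data frames, `coords` (standing in for the `fortify()` output, with its `order` column) and `attrs` (standing in for `sf@data`). The coordinates and `pop2000` values are made up for illustration.

```r
# Hypothetical stand-in for the output of fortify(): two tiny square polygons.
coords <- data.frame(
  long = c(0, 1, 1, 0, 2, 3, 3, 2),
  lat  = c(0, 0, 1, 1, 0, 0, 1, 1),
  id   = rep(c("california_590", "california_586"), each = 4),
  stringsAsFactors = FALSE
)
coords$order <- seq_len(nrow(coords))

# Hypothetical stand-in for sf@data after copying the row names to place_id.
# The pop2000 values are illustrative, not real census figures.
attrs <- data.frame(
  place_id = c("california_586", "california_590"),
  pop2000  = c(100, 200),
  stringsAsFactors = FALSE
)

merged <- merge(coords, attrs, by.x = "id", by.y = "place_id")

# Restore the vertex sequence recorded by fortify() before plotting.
merged <- merged[order(merged$order), ]
print(merged)
```

If the plotted polygons look scrambled, sorting `sf_ggplot_all` by its `order` column in the same way before calling `ggplot()` is a reasonable first thing to try.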
