ggplot/mapping US counties: problems with visualization shapes in R


Problem description


So I have a data frame in R called obesity_map which basically gives me the state, county, and obesity rate per county. It looks more or less like this:

obesity_map = data.frame(state, county, obesity_rate)

I'm trying to visualize this on the map by showing various obesity rates per county throughout the US with this:

us.state.map <- map_data('state')
head(us.state.map)
states <- levels(as.factor(us.state.map$region))
df <- data.frame(region = states, value = runif(length(states), min=0, max=100),stringsAsFactors = FALSE)

map.data <- merge(us.state.map, df, by='region', all=T)
map.data <- map.data[order(map.data$order),]
head(map.data)

map.county <- map_data('county')
county.obesity <- data.frame(region = obesity_map$state, subregion = obesity_map$county, value = obesity_map$obesity_rate)
map.county <- merge(county.obesity, map.county, all=TRUE)
ggplot(map.county, aes(x = long, y = lat, group=group, fill=as.factor(value))) + geom_polygon(colour = "white", size = 0.1)

And it basically creates an image that looks like this:

As you can see, the US is divided into strange shapes, the colors aren't one consistent color in varying gradients, and you can't make much from it. But what I really want is something like this below but with each county filled in:

I'm fairly new to this so I'd appreciate any and all help!
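One likely contributor to the strange shapes, offered as a hedged sketch rather than a confirmed diagnosis: `merge()` sorts its result on the key columns by default and does not promise to keep the original row order, while `geom_polygon()` connects vertices in the order the rows arrive. The state-level code above restores that order with `order(map.data$order)`, but the county-level merge does not. The toy data below (`toy_map`, `toy_values`) is made up for illustration and only mimics the column layout of `map_data()`.

```r
# Toy "map": two squares whose vertices are listed in drawing order,
# mimicking the long/lat/order columns returned by map_data().
toy_map <- data.frame(
  region = rep(c("zeta", "alpha"), each = 4),
  long   = c(0, 1, 1, 0,  2, 3, 3, 2),
  lat    = c(0, 0, 1, 1,  0, 0, 1, 1),
  order  = 1:8,
  stringsAsFactors = FALSE
)

# Hypothetical per-region values to attach to the polygons.
toy_values <- data.frame(
  region = c("alpha", "zeta"),
  value  = c(24.5, 23.6),
  stringsAsFactors = FALSE
)

# merge() sorts on the key, so the rows no longer follow the drawing order.
merged <- merge(toy_values, toy_map, by = "region", all = TRUE)
print(merged$order)

# Restore the drawing order before passing the data to geom_polygon().
merged <- merged[order(merged$order), ]
print(merged$order)
```

Separately, wrapping `value` in `as.factor()` makes ggplot2 use a discrete scale with one color per distinct rate. Leaving `value` numeric should give a single continuous gradient instead.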


Edit:

Here's the output of the dput:

dput(obesity_map)

structure(list(X = 1:3141, FIPS = c(1L, 3L, 5L, 7L, 9L, 11L, 13L, 15L, 17L, 19L, 21L, 23L, 25L, 27L, 29L, 31L, 33L, 35L, 37L, 39L, 41L, 43L, 45L, 47L, 49L, 51L, 53L, 55L, 57L, 59L, 61L, 63L, 65L, 67L, 69L, 71L, 73L, 75L, 77L, 79L, 81L, 83L, 85L, 87L, 89L, 91L, 93L, 95L, 97L, 99L, 101L, 103L, 105L, 107L, 109L, 111L, 113L, 115L, 117L, 119L, 121L, 123L, 125L, 127L, 129L, 131L, 133L, 13L, 16L, 20L, 50L, 60L, 68L, 70L, 90L, 100L, 110L, 122L, 130L, 150L, 164L, 170L, 180L, 185L, 188L, 201L, 220L, 232L, 240L, 261L, 270L, 280L, 282L, 290L, 1L, 3L, 5L, 7L, 9L, 11L, 12L, 13L, 15L, 17L, 19L, 21L, 23L, 25L, 27L, 1L, 3L, 5L, 7L, 9L, 11L, 13L, 15L, 17L, 19L, 21L, 23L, 25L, 27L, 29L, 31L, 33L, 35L, 37L, 39L, 41L,

It's a huge amount of numbers because it's for every US county so I abbreviated the results and put in the first couple lines.

Basically, the data frame looks like this though:

print(head(obesity_map))


  X FIPS state_names county_names obesity
1 1    1     Alabama      Autauga    24.5
2 2    3     Alabama      Baldwin    23.6
3 3    5     Alabama      Barbour    25.6
4 4    7     Alabama         Bibb     0.0
5 5    9     Alabama       Blount    24.2
6 6   11     Alabama      Bullock     0.0

I also tried using ggcounty by following the posted example, but I keep getting an error. I'm not entirely sure what I've done wrong:

library(ggcounty)

# breaks
obesity_map$obese <- cut(obesity_map$obesity, 
                  breaks=c(0, 5, 10, 15, 20, 25, 30), 
                  labels=c("1", "2", "3", "4", 
                           "5", "6"),
                  include.lowest=TRUE)

# get the US counties map (lower 48)
us <- ggcounty.us()

# start the plot with our base map
gg <- us$g

# add a new geom with our population (choropleth)
gg <- gg + geom_map(data=obesity_map, map=us$map,
                aes(map_id=FIPS, fill=obesity_map$obese), 
                color="white", size=0.125)

But I always end up getting an error saying: "Error: Argument must be coercible to non-negative integer"

Any idea? Thanks again for all your help! I appreciate it so much.
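A side note on the `cut()` call above, as a small sketch with made-up rates: with breaks that stop at 30, any rate above 30 falls outside every interval and becomes `NA`. Whether this is related to the reported error is not established here. It only shows how the binning behaves.

```r
# Hypothetical obesity rates, including one above the last break.
rates <- c(0, 24.5, 30, 35.1)

binned <- cut(rates,
              breaks = c(0, 5, 10, 15, 20, 25, 30),
              labels = c("1", "2", "3", "4", "5", "6"),
              include.lowest = TRUE)

# Intervals are (lower, upper], and include.lowest = TRUE keeps 0 in bin "1".
# Values above 30 match no interval and are returned as NA.
print(binned)
print(is.na(binned))
```

If counties with rates above 30 should be colored too, extending the last break (for example to `Inf`) and adding a matching label would keep them out of the `NA` group.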

Solution

Maybe a little late for another answer, but I think it is still worth sharing.

The reading and preprocessing of the data is similar to jlhoward's answer, with some differences:

library(tmap)      # package for plotting
library(readxl)    # for reading Excel
library(maptools)  # for unionSpatialPolygons

# download data
download.file("http://www.ers.usda.gov/datafiles/Food_Environment_Atlas/Data_Access_and_Documentation_Downloads/Current_Version/DataDownload.xls", destfile = "DataDownload.xls", mode="wb")
df <- read_excel("DataDownload.xls", sheet = "HEALTH")

# download shape (a little less detail than in the other scripts)
f <- tempfile()
download.file("http://www2.census.gov/geo/tiger/GENZ2010/gz_2010_us_050_00_20m.zip", destfile = f)
unzip(f, exdir = ".")
US <- read_shape("gz_2010_us_050_00_20m.shp")

# leave out AK, HI, and PR (state FIPS: 02, 15, and 72)
US <- US[!(US$STATE %in% c("02","15","72")),]  

# append data to shape
US$FIPS <- paste0(US$STATE, US$COUNTY)
US <- append_data(US, df, key.shp = "FIPS", key.data = "FIPS")
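The join above relies on FIPS codes being zero-padded character strings: a two-digit state code followed by a three-digit county code. The sketch below, using made-up vectors `state_code` and `county_code`, shows how integer codes like those in the question's `FIPS` column could be turned into that form, and why the state filter compares against `"02"` rather than the number `2`. It assumes a numeric state code is available, which the question's data frame does not show.

```r
# Hypothetical integer codes: two Alabama counties and one Alaska county.
state_code  <- c(1L, 1L, 2L)
county_code <- c(1L, 3L, 13L)

# Zero-pad to 2 + 3 digits to build the five-character county FIPS key.
fips <- sprintf("%02d%03d", state_code, county_code)
print(fips)

# Numbers are converted to text without leading zeros before matching,
# so a padded code such as "02" is not found among numeric values.
numeric_match <- "02" %in% c(2, 15, 72)
string_match  <- "02" %in% c("02", "15", "72")
print(c(numeric_match, string_match))
```

Keeping both sides of the join as padded strings avoids silent mismatches when one table stores the codes as integers.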

When the correct data is attached to the shape object, a choropleth can be drawn with one line of code:

qtm(US, fill = "PCT_OBESE_ADULTS10")

This could be enhanced by adding state borders, a better projection, and a title:

# create shape object with state polygons
US_states <- unionSpatialPolygons(US, IDs=US$STATE)

tm_shape(US, projection="+init=epsg:2163") +
  tm_polygons("PCT_OBESE_ADULTS10", border.col = "grey30", title="") +
tm_shape(US_states) +
  tm_borders(lwd=2, col = "black", alpha = .5) +
tm_layout(title="2010 Adult Obesity by County, percent", 
          title.position = c("center", "top"),
          legend.text.size=1)
