Extracting city and state information from a Google street address
Question
I have a data set that contains lat/long information for different point locations, and I would like to know which city and state are associated with each point.
Following this example, I used the revgeocode() function from the ggmap package to obtain a street address for each location, producing the data frame below:
df <- structure(list(PointID = c(1787L, 2805L, 3025L, 3027L, 3028L,
3029L, 3030L, 3031L, 3033L), Latitude = c(38.36648102, 36.19548585,
43.419774, 43.437222, 43.454722, 43.452643, 43.411949, 43.255479,
43.261464), Longitude = c(-76.4802046, -94.21554661, -87.960399,
-88.018333, -87.974722, -87.978542, -87.94149, -87.986433, -87.968612
), Address = structure(c(2L, 8L, 5L, 3L, 9L, 7L, 4L, 1L, 6L), .Label = c("13004 N Thomas Dr, Mequon, WI 53097, USA",
"2160 Turner Rd, Lusby, MD 20657, USA", "2805 County Rd Y, Saukville, WI 53080, USA",
"3701-3739 County Hwy W, Saukville, WI 53080, USA", "3907 Echo Ln, Saukville, WI 53080, USA",
"4823 W Bonniwell Rd, Mequon, WI 53097, USA", "5100-5260 County Rd I, Saukville, WI 53080, USA",
"7948 W Gibbs Rd, Springdale, AR 72762, USA", "River Park Rd, Saukville, WI 53080, USA"
), class = "factor")), row.names = c(NA, -9L), class = "data.frame", .Names = c("PointID",
"Latitude", "Longitude", "Address"))
I would like to use R to extract the city/state information from the full street address, and create two columns to store this information ("City" and "State").
I'm assuming the stringr package is the way to go, but I'm not sure how to use it. The example above used the following code to extract the zip code from the address column (named "result" in that example). Their data set:
# ID Longitude Latitude result
# 1 311175 41.29844 -72.92918 16 Church Street South, New Haven, CT 06519, USA
# 2 292058 41.93694 -87.66984 1632 West Nelson Street, Chicago, IL 60657, USA
# 3 12979 37.58096 -77.47144 2077-2199 Seddon Way, Richmond, VA 23230, USA
And the code to extract the zip code:
library(stringr)
data$zipcode <- substr(str_extract(data$result," [0-9]{5}, .+"),2,6)
data[,-4]
Is it possible to easily modify the above code to get the city and state data?
Recommended answer
You can get the city and state using revgeocode() itself:
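library(ggmap)
# revgeocode() expects c(longitude, latitude), hence columns 3:2.
# output = "more" returns the address components as columns
# (available in the ggmap version used in this answer).
df <- cbind(df, do.call(rbind,
  lapply(seq_len(nrow(df)), function(i)
    revgeocode(as.numeric(df[i, 3:2]), output = "more")[
      c("administrative_area_level_1", "locality")])))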
df
# PointID Latitude Longitude Address
# 1 1787 38.36648 -76.48020 2160 Turner Rd, Lusby, MD 20657, USA
# 2 2805 36.19549 -94.21555 7948 W Gibbs Rd, Springdale, AR 72762, USA
# 3 3025 43.41977 -87.96040 3907 Echo Ln, Saukville, WI 53080, USA
# 4 3027 43.43722 -88.01833 2805 County Rd Y, Saukville, WI 53080, USA
# 5 3028 43.45472 -87.97472 River Park Rd, Saukville, WI 53080, USA
# 6 3029 43.45264 -87.97854 5100-5260 County Rd I, Saukville, WI 53080, USA
# 7 3030 43.41195 -87.94149 3701-3739 County Hwy W, Saukville, WI 53080, USA
# 8 3031 43.25548 -87.98643 13004 N Thomas Dr, Mequon, WI 53097, USA
# 9 3033 43.26146 -87.96861 4823 W Bonniwell Rd, Mequon, WI 53097, USA
# administrative_area_level_1 locality
# 1 Maryland Lusby
# 2 Arkansas Springdale
# 3 Wisconsin Saukville
# 4 Wisconsin Saukville
# 5 Wisconsin Saukville
# 6 Wisconsin Saukville
# 7 Wisconsin Saukville
# 8 Wisconsin Mequon
# 9 Wisconsin Mequon
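The question asked for columns named "City" and "State", while the output above uses Google's component names and full state names. A minimal sketch of the renaming step follows, assuming the two appended columns look like the output shown. Here `res` is a hypothetical stand-in for those columns, and `StateAbb` is an illustrative column name. `state.name` and `state.abb` are constants built into R.

```r
# Hypothetical stand-in for the two columns appended by revgeocode()
res <- data.frame(
  administrative_area_level_1 = c("Maryland", "Arkansas", "Wisconsin"),
  locality = c("Lusby", "Springdale", "Saukville"),
  stringsAsFactors = FALSE
)

# Rename Google's component names to the column names the question asked for
names(res)[names(res) == "administrative_area_level_1"] <- "State"
names(res)[names(res) == "locality"] <- "City"

# Look up the two-letter abbreviation for each full state name
# (state.name and state.abb are parallel built-in vectors)
res$StateAbb <- state.abb[match(res$State, state.name)]
res
```

Names that do not appear in `state.name` (for example, locations outside the 50 states) would come back as NA in `StateAbb`.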
P.S. You can do everything (including getting the address and/or zip code) in one step. Just add "address" and/or "postal_code" to c("administrative_area_level_1","locality"), which is the vector of variables that you want to extract.
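If the addresses have already been retrieved and you would rather not query the API again, the city and state can also be pulled out of the address string with a regular expression, which is closer to what the question asked. The sketch below uses base R rather than stringr and assumes every address ends in the "City, ST 12345, Country" layout seen in the question's data. This is not the answer's method, and addresses in other layouts would yield NA.

```r
# A few addresses copied from the question's data frame
address <- c("2160 Turner Rd, Lusby, MD 20657, USA",
             "7948 W Gibbs Rd, Springdale, AR 72762, USA",
             "River Park Rd, Saukville, WI 53080, USA")

# Group 1: city (text between the last commas before the state)
# Group 2: two-letter state code, followed by a 5-digit zip and the country
pattern <- "^.*, ([^,]+), ([A-Z]{2}) [0-9]{5}, [^,]+$"
parts <- regmatches(address, regexec(pattern, address))

# Each match holds the full match plus two groups; non-matches are empty
pick <- function(p, k) if (length(p) == 3) p[k] else NA_character_
city  <- vapply(parts, pick, character(1), k = 2)
state <- vapply(parts, pick, character(1), k = 3)

data.frame(Address = address, City = city, State = state)
```

In the question's data frame the Address column is a factor, so it would need to be converted first, for example with as.character(df$Address), before being passed to regexec().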