How to GeoCode a simple address using Data Science Toolbox


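Question

I am fed up with Google's geocoding and decided to try an alternative. The Data Science Toolkit (http://www.datasciencetoolkit.org) lets you geocode an unlimited number of addresses. R has an excellent package that wraps its functions (CRAN: RDSTK). The package has a function called street2coordinates() that interfaces with the Data Science Toolkit's geocoding utility.

However, the RDSTK function street2coordinates() does not work if you try to geocode something simple such as City, Country. In the example below, I try to use the function to get the latitude and longitude of Phoenix: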

> require("RDSTK")
> street2coordinates("Phoenix+Arizona+United+States")
[1] full.address
<0 rows> (or 0-length row.names)

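The utility on the Data Science Toolkit site itself works fine. This URL request returns the answer: http://www.datasciencetoolkit.org/maps/api/geocode/json?sensor=false&address=Phoenix+Arizona+United+States

I am interested in geocoding multiple addresses (both complete addresses and city names). I know the Data Science Toolkit URL will work for them.

How do I interface with the URL and get the latitudes and longitudes for multiple addresses into a data frame alongside the addresses?

Here is a sample dataset: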

dff <- data.frame(address=c(
  "Birmingham, Alabama, United States",
  "Mobile, Alabama, United States",
  "Phoenix, Arizona, United States",
  "Tucson, Arizona, United States",
  "Little Rock, Arkansas, United States",
  "Berkeley, California, United States",
  "Duarte, California, United States",
  "Encinitas, California, United States",
  "La Jolla, California, United States",
  "Los Angeles, California, United States",
  "Orange, California, United States",
  "Redwood City, California, United States",
  "Sacramento, California, United States",
  "San Francisco, California, United States",
  "Stanford, California, United States",
  "Hartford, Connecticut, United States",
  "New Haven, Connecticut, United States"
  ))

Recommended answer

Like this:

library(httr)
library(rjson)

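# Build a JSON array of quoted address strings, e.g. ["addr 1","addr 2"]
data <- paste0("[",paste(paste0("\"",dff$address,"\""),collapse=","),"]")
url  <- "http://www.datasciencetoolkit.org/street2coordinates"
response <- POST(url,body=data)
json     <- fromJSON(content(response,as="text",encoding="UTF-8"))
# lapply always returns a list; addresses with no match yield NULL and are dropped by rbind
geocode  <- do.call(rbind,lapply(json,
                                 function(x) c(long=x$longitude,lat=x$latitude)))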
geocode
#                                                long      lat
# San Francisco, California, United States -117.88536 35.18713
# Mobile, Alabama, United States            -88.10318 30.70114
# La Jolla, California, United States      -117.87645 33.85751
# Duarte, California, United States        -118.29866 33.78659
# Little Rock, Arkansas, United States      -91.20736 33.60892
# Tucson, Arizona, United States           -110.97087 32.21798
# Redwood City, California, United States  -117.88536 35.18713
# New Haven, Connecticut, United States     -72.92751 41.36571
# Berkeley, California, United States      -122.29673 37.86058
# Hartford, Connecticut, United States      -72.76356 41.78516
# Sacramento, California, United States    -121.55541 38.38046
# Encinitas, California, United States     -116.84605 33.01693
# Birmingham, Alabama, United States        -86.80190 33.45641
# Stanford, California, United States      -122.16750 37.42509
# Orange, California, United States        -117.85311 33.78780
# Los Angeles, California, United States   -117.88536 35.18713

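This takes advantage of the POST interface to the street2coordinates API (see its documentation), which returns all the results in one request rather than requiring multiple GET requests.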
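
The output above has 16 rows for 17 input addresses, so one address came back without coordinates. Here is a minimal sketch of how such addresses could be listed. It assumes the parsed response is a named list keyed by address, with NULL (or no entry at all) where nothing was found. The helper missing_addresses and the toy parsed list are illustrative only and are not part of httr, rjson, or the Data Science Toolkit.

```r
# Hypothetical helper: list the input addresses that have no coordinates in
# a parsed batch response. 'parsed' is assumed to be a named list keyed by
# address, with NULL (or no entry at all) for addresses that were not found.
missing_addresses <- function(addresses, parsed) {
  found <- names(parsed)[!vapply(parsed, is.null, logical(1))]
  setdiff(as.character(addresses), found)
}

# Toy stand-in for the parsed response (rounded values, no network access)
parsed <- list(
  "Tucson, Arizona, United States"  = list(longitude = -110.97, latitude = 32.22),
  "Phoenix, Arizona, United States" = NULL
)
addresses <- c("Phoenix, Arizona, United States",
               "Tucson, Arizona, United States")
missing_addresses(addresses, parsed)
```

With the real data, the call would presumably be missing_addresses(dff$address, json), and whatever it returns could then be retried one address at a time.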

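The absence of Phoenix appears to be a bug in the street2coordinates API. If you go to the API demo page and try "Phoenix, Arizona, United States", the response is null. However, as your example shows, their "Google-style geocoder" does return a result for Phoenix. So here is a solution that uses repeated GET requests. Note that it runs much more slowly.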

library(httr)
library(rjson)

geo.dsk <- function(addr){ # geocode a single address with the Data Science Toolkit
  url      <- "http://www.datasciencetoolkit.org/maps/api/geocode/json"
  response <- GET(url,query=list(sensor="FALSE",address=addr))
  json <- fromJSON(content(response,as="text",encoding="UTF-8"))
  if (length(json$results) == 0)   # no match: return NA instead of failing
    return(c(address=addr,long=NA,lat=NA))
  loc  <- json$results[[1]]$geometry$location
  return(c(address=addr,long=loc$lng, lat= loc$lat))
}
# one GET request per address; each row is a character vector, so long and lat are stored as text
result <- do.call(rbind,lapply(as.character(dff$address),geo.dsk))
result <- data.frame(result)
result
#                                     address         long        lat
# 1        Birmingham, Alabama, United States   -86.801904  33.456412
# 2            Mobile, Alabama, United States   -88.103184  30.701142
# 3           Phoenix, Arizona, United States -112.0733333 33.4483333
# 4            Tucson, Arizona, United States  -110.970869  32.217975
# 5      Little Rock, Arkansas, United States   -91.207356  33.608922
# 6       Berkeley, California, United States   -122.29673  37.860576
# 7         Duarte, California, United States  -118.298662  33.786594
# 8      Encinitas, California, United States  -116.846046  33.016928
# 9       La Jolla, California, United States  -117.876447  33.857515
# 10   Los Angeles, California, United States  -117.885359  35.187133
# 11        Orange, California, United States  -117.853112  33.787795
# 12  Redwood City, California, United States  -117.885359  35.187133
# 13    Sacramento, California, United States  -121.555406  38.380456
# 14 San Francisco, California, United States  -117.885359  35.187133
# 15      Stanford, California, United States    -122.1675   37.42509
# 16     Hartford, Connecticut, United States   -72.763564   41.78516
# 17    New Haven, Connecticut, United States   -72.927507  41.365709
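
Because each row returned by geo.dsk is a character vector, the long and lat columns of result hold text rather than numbers. The sketch below shows one way to attach numeric coordinates to the original data frame by matching on address. The helper attach_coords and the third test address are hypothetical; the coordinate values are copied from the output above.

```r
# Hypothetical helper: add numeric long/lat columns to 'df' by matching its
# address column against 'coords' (columns address, long, lat, stored as text
# or factors). Addresses missing from 'coords' get NA.
attach_coords <- function(df, coords) {
  idx <- match(as.character(df$address), as.character(coords$address))
  df$long <- as.numeric(as.character(coords$long[idx]))
  df$lat  <- as.numeric(as.character(coords$lat[idx]))
  df
}

# Toy input shaped like 'result' above (values copied from that output)
coords <- data.frame(
  address = c("Phoenix, Arizona, United States",
              "Birmingham, Alabama, United States"),
  long    = c("-112.0733333", "-86.801904"),
  lat     = c("33.4483333", "33.456412")
)
df <- data.frame(address = c("Birmingham, Alabama, United States",
                             "Phoenix, Arizona, United States",
                             "Nowhere, Atlantis"))  # made-up address with no match
out <- attach_coords(df, coords)
str(out)
```

The as.character() step matters when the columns are factors, because as.numeric() applied directly to a factor returns its internal level codes rather than the printed values. With the real data the call would presumably be attach_coords(dff, result).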
