How to load all fields/ExtendedData (not just 'name' and 'description') from a KML layer into R

Question

I've been working on loading KML files into R to make web maps with Leaflet/Shiny. The import is pretty simple (using this sample KML):

library(rgdal)

sampleKml <- readOGR("D:/KML_Samples.kml", layer = ogrListLayers("D:/KML_Samples.kml")[1])

In this example, ogrListLayers pulls in all of the kml layers, and I subset only the first element/layer. Easy peasy.

The problem is that using this method to read KML layers only pulls in two fields: "Name" and "Description," as seen below:

> sampleKml <- readOGR("D:/KML_Samples.kml", layer = ogrListLayers("D:/KML_Samples.kml")[1])
OGR data source with driver: KML 
Source: "D:/KML_Samples.kml", layer: "Placemarks"
with 3 features
It has 2 fields
> sampleKml@data
                Name                                                                                  Description
1   Simple placemark Attached to the ground. Intelligently places itself at the height of the underlying terrain.
2 Floating placemark                                                  Floats a defined distance above the ground.
3 Extruded placemark                                              Tethered to the ground by a customizable "tail" 

So R reads the KML layer as a SpatialPointsDataFrame with 3 features (3 different points) and two fields (the columns). However, when I pull the layer into QGIS and read its attribute table, there are many fields in addition to Name and Description.

From what I can tell, 'name' and 'description' are standard elements of a KML Placemark, and any additional data are stored as ExtendedData. I want to import this extended data along with the placemark data.

Is there a way to pull ALL of these KML layer fields/attributes into R? Preferably with readOGR(), but I'm open to all suggestions.

Recommended answer

TL;DR

The underlying problem is that the LibKML library is missing on Windows. My solution is to extract the data directly from the KML file with a function.

I ran into the same problem and after some googling it appears that this has something to do with LibKML and Windows. Executing the same code on my Ubuntu machine yielded different results, namely the ExtendedData was retrieved when loading the saved KML file.

library(rgdal)
library(dplyr)
poly_df<-data.frame(x=c(1,1,0,0),y=c(1,0,0,1))
poly<-poly_df %>% 
  Polygon %>% 
  list %>% 
  Polygons(ID="1") %>% 
  list %>% 
  SpatialPolygons(proj4string = CRS("+init=epsg:4326")) %>% 
  SpatialPolygonsDataFrame(data=data.frame(test="this is a test"))

writeOGR(poly,"test.kml",driver="KML",layer="poly")
poly2<-readOGR("test.kml")
poly2@data
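
Whether the round trip above returns the ExtendedData seems to depend on which KML driver the local GDAL build provides. A minimal sketch for checking this, assuming the LibKML-backed driver is registered under the name "LIBKML" (the plain driver is registered as "KML"); the variable names are only illustrative:

```r
library(rgdal)

# ogrDrivers() returns a data frame describing the vector drivers of this GDAL build
drivers <- ogrDrivers()

# TRUE if the LibKML-based driver is available, FALSE if only the basic KML driver is
has_libkml <- "LIBKML" %in% drivers$name
has_libkml
```

If this returns FALSE, readOGR presumably falls back to the basic KML driver, which would match the two-field result described in the question.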

If one manages to build LibKML [1], one should be able to load KML files including the ExtendedData [2].

On Windows, LibKML needs to be built with Visual Studio 2005 [1]. This Visual Studio version is no longer supported [3]. In [3], user2889419 supplies a link to the 2005 version.
I downloaded and installed that version, but building LibKML eventually failed with a lot of errors and warnings (certain files do not exist). This is where I stopped, because I am way out of my comfort zone, but I wanted to share the results of my chase.

My solution is to read the KML directly and extract the ExtendedData, while loading the spatial object via rgdal's readOGR. My assumption is that readOGR reads the features from the top of the file in the same order as my extraction routine. Both results are then merged, and the output is a SpatialPolygonsDataFrame.
I had some trouble extracting the nodes from the KML files at first because I was not aware of the concept of namespaces [4]. (I edited the following function because I ran into trouble with KML files of other origins.)

library(rgdal)
library(xml2)
library(dplyr)  # provides the %>% pipe

readKML <- function(file,keep_name_description=FALSE,layer,...) {
  # Set keep_name_description = TRUE to keep "Name" and "Description" columns
  #   in the resulting SpatialPolygonsDataFrame. Only works when there is
  #   ExtendedData in the kml file.

  sp_obj<-readOGR(file,layer,...)
  xml1<-read_xml(file)
  if (!missing(layer)) {
    different_layers <- xml_find_all(xml1, ".//d1:Folder") 
    layer_names <- different_layers %>% 
      xml_find_first(".//d1:name") %>% 
      xml_contents() %>% 
      xml_text()

    selected_layer <- layer_names==layer
    if (!any(selected_layer)) stop("Layer does not exist.")
    xml2 <- different_layers[selected_layer]
  } else {
    xml2 <- xml1
  }

  # extract name and type of variables

  variable_names1 <- 
    xml_find_first(xml2, ".//d1:ExtendedData") %>% 
    xml_children() 

  while(variable_names1 %>% 
        xml_attr("name") %>% 
        is.na() %>% 
        any()&variable_names1 %>%
        xml_children() %>% 
        length>0) variable_names1 <- variable_names1 %>%
    xml_children()

  variable_names <- variable_names1 %>%
    xml_attr("name") %>% 
    unique()

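  # return sp_obj if no ExtendedData is present
  # (xml_attr() on an empty node set yields character(0), not NULL)
  if (length(variable_names) == 0 || all(is.na(variable_names))) return(sp_obj)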

  data1 <- xml_find_all(xml2, ".//d1:ExtendedData") %>% 
    xml_children()

  while(data1 %>%
        xml_children() %>% 
        length>0) data1 <- data1 %>%
    xml_children()

  data <- data1 %>% 
    xml_text() %>% 
    matrix(.,ncol=length(variable_names),byrow = TRUE) %>% 
    as.data.frame()

  colnames(data) <- variable_names

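  if (keep_name_description) {
    # keep "Name" and "Description" and append the ExtendedData columns
    try(sp_obj@data <- cbind(sp_obj@data,data),silent=TRUE)
  } else {
    # replace "Name" and "Description" with the ExtendedData columns
    sp_obj@data <- data
  }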
  sp_obj
}
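
The namespace issue and the row-wise reshaping used above can be illustrated in isolation. This is a minimal sketch on an inline KML string; the placemarks and the `pop`/`code` fields are made up for illustration and do not come from the sample file. xml2 exposes a document's default namespace under the prefix `d1`, so an un-prefixed XPath matches nothing in a namespaced KML document.

```r
library(xml2)

# Hypothetical two-placemark KML document, used only for this illustration
kml <- '<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>A</name>
      <ExtendedData>
        <Data name="pop"><value>10</value></Data>
        <Data name="code"><value>x1</value></Data>
      </ExtendedData>
    </Placemark>
    <Placemark>
      <name>B</name>
      <ExtendedData>
        <Data name="pop"><value>20</value></Data>
        <Data name="code"><value>y2</value></Data>
      </ExtendedData>
    </Placemark>
  </Document>
</kml>'

doc <- read_xml(kml)

# Without the namespace prefix, no nodes are matched
n_unprefixed <- length(xml_find_all(doc, ".//ExtendedData"))

# With the default-namespace prefix d1, the Data nodes are found
data_nodes  <- xml_find_all(doc, ".//d1:ExtendedData/d1:Data")
field_names <- unique(xml_attr(data_nodes, "name"))
values      <- xml_text(xml_find_first(data_nodes, "./d1:value"))

# One row per placemark: fill the flat vector of values row by row
ext <- as.data.frame(matrix(values, ncol = length(field_names), byrow = TRUE),
                     stringsAsFactors = FALSE)
colnames(ext) <- field_names
ext
```

Note that the row-wise fill only lines up if every placemark carries the same fields in the same order; a file with missing or reordered fields would need a per-placemark lookup instead.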

Old: extracting via readLines

This earlier version follows the same idea, but scans the raw lines of the file with readLines instead of parsing the XML. It relies on the same assumption that readOGR and the extraction routine both go through the features from the top of the file, so the two results can be merged into a SpatialPolygonsDataFrame.

library(tidyverse)
library(rgdal)

readKML<-function(file,keep_name_description=FALSE,...) {
  # Set keep_name_description = TRUE to keep "Name" and "Description" columns 
  #   in the resulting SpatialPolygonsDataFrame. Only works when there is 
  #   ExtendedData in the kml file.

  if (!grepl("\\.kml$",file)) stop("File is not a KML file.")
  if (!file.exists(file)) stop("File does not exist.")
  map<-readOGR(file,...)

  f1<-readLines(file)

  # get positions of ExtendedData in document
  exdata_position<-grep("ExtendedData",f1) %>% 
    matrix(ncol=2,byrow = TRUE) %>% 
    apply(1,function(x) {
      pos<-x[1]:x[2]
      pos[2:(length(pos)-1)]
    }) %>% 
    t %>% 
    as.data.frame

  # if there is no ExtendedData return SpatialPolygonsDataFrame
  if (ncol(exdata_position)==0) return(map)

  # Get Name of different columns
  extract1<-f1[exdata_position[1,] %>% 
                 unlist]  
  names_of_data<-extract1 %>% 
    strsplit("name=\"") %>%
    lapply(function(x) strsplit(x[[2]],split="\"") ) %>%
    unlist(recursive = FALSE) %>%
    lapply(function(x) return(x[1])) %>% 
    unlist

  # Extract Extended Data
  dat<-lapply(seq(nrow(exdata_position)),function(x) {
    extract2<-f1[exdata_position[x,] %>% 
                   unlist]  
    extract2 %>% 
      strsplit(">") %>%
      lapply(function(x) strsplit(x[[2]],split="<") ) %>% unlist(recursive = FALSE) %>%
      lapply(function(x) return(x[1])) %>% 
      unlist %>% 
      matrix(nrow=1) %>% 
      as.data.frame
  }) %>% 
    do.call(rbind,.)

  # Rename columns
  colnames(dat)<-names_of_data

  # Check if Name and Description should be dropped
  if (keep_name_description) {
    map@data<-cbind(map@data,dat)
  } else {
    map@data<-dat
  }
  map
}
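
The line-based idea can be sketched on a small character vector. This is only an illustration in base R, assuming each field sits on its own line between an opening and a closing ExtendedData line; the `pop`/`code` fields are hypothetical, and regular expressions are used here in place of the strsplit calls above.

```r
# Hypothetical excerpt of a KML file as returned by readLines()
lines <- c(
  '<Placemark>',
  '<ExtendedData>',
  '  <Data name="pop"><value>10</value></Data>',
  '  <Data name="code"><value>x1</value></Data>',
  '</ExtendedData>',
  '</Placemark>'
)

# Lines mentioning ExtendedData come in open/close pairs: one row per block
tags   <- grep("ExtendedData", lines)
bounds <- matrix(tags, ncol = 2, byrow = TRUE)

# Lines strictly between the first opening and closing tag
inner <- lines[(bounds[1, 1] + 1):(bounds[1, 2] - 1)]

# Pull the name attribute and the value out of each line
field_names  <- sub('.*name="([^"]*)".*', "\\1", inner)
field_values <- sub(".*<value>([^<]*)</value>.*", "\\1", inner)

setNames(field_values, field_names)
```

This breaks as soon as a writer puts several fields on one line or wraps a value across lines, which is presumably why the XML-based version replaced it.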

[1] https://github.com/google/libkml/wiki/Building-and-installing-libkml
[2] https://github.com/r-spatial/sf/issues/499
[3] Where to download visual studio express 2005?
[4] Parsing XML in R: Incorrect namespaces
