Using R to download and read a zipped XML file

Question

Based on this answer by Dirk Eddelbuettel, I am trying to read an XML file from a zip archive for further processing. Apart from the URL and file names, the only change to the referenced code is that I replaced read.table with xmlInternalTreeParse.

library(XML)
temp <- tempfile()
download.file("http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing?sort=1&downfile=data%2Fnrg_105a.sdmx.zip",temp)
doc <- xmlInternalTreeParse(unz(temp, "nrg_105a.dsd.xml"))
unlink(temp)
closeAllConnections()

However, this returns the following error:

Error in file.exists(file) : invalid 'file' argument

traceback() shows that this is a function call from within the parser, so the unz() connection seems to be an inappropriate argument in this context. Is there a way to make this work?

Recommended answer

You can try:

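library(XML)

## Make a temporary file (tf) inside the session's temporary folder (tdir)
tf <- tempfile(tmpdir = tdir <- tempdir())

## Download the zip file (mode = "wb" keeps the binary archive intact on Windows)
download.file("http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing?sort=1&downfile=data%2Fnrg_105a.sdmx.zip", tf, mode = "wb")

## Unzip it in the temp folder; unzip() returns the paths of the extracted files
xml_files <- unzip(tf, exdir = tdir)

## Parse the first file
doc <- xmlInternalTreeParse(xml_files[1])

## Delete the downloaded archive and the extracted files.
## tdir is the session's own tempdir(), so the folder itself is left in place;
## removing it would break later tempfile() calls in the same session.
unlink(c(tf, xml_files))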
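
The error in the question most likely arises because the parser treats its first argument as a file name (hence the internal file.exists() call), while unz() returns a connection. If extracting to disk is undesirable, one possible alternative is to read the archive entry through the connection and hand the text to the parser. The following is only a sketch under assumptions: read_zipped_xml is a hypothetical helper name, not part of the XML package or the original answer, and it assumes the entry is a text file whose encoding readLines() can handle.

```r
library(XML)

# Hypothetical helper (an illustration, not from the original answer):
# parse one XML entry of a zip archive without extracting it to disk.
read_zipped_xml <- function(zip_path, entry) {
  con <- unz(zip_path, entry)
  # readLines() opens and closes the unopened connection itself;
  # close() afterwards destroys the connection object.
  on.exit(close(con))
  xml_text <- paste(readLines(con, warn = FALSE), collapse = "\n")
  # asText = TRUE tells the parser that the argument is XML content,
  # not a file name.
  xmlInternalTreeParse(xml_text, asText = TRUE)
}
```

Assuming tf still points to the downloaded archive (that is, before the temporary files are deleted), it could be called as doc <- read_zipped_xml(tf, "nrg_105a.dsd.xml"). The entry names in an archive can be listed with unzip(tf, list = TRUE)$Name, which avoids guessing the file name.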
