Using R to download zipped data file, extract, and import data
Question
@EZGraphs on Twitter writes: "Lots of online csvs are zipped. Is there a way to download, unzip the archive, and load the data to a data.frame using R? #Rstats"
I was also trying to do this today, but ended up just downloading the zip file manually.
I tried something like:
fileName <- "http://www.newcl.org/data/zipfiles/a1.zip"
con1 <- unz(fileName, filename="a1.dat", open = "r")
but I feel as if I'm a long way off. Any thoughts?
Recommended answer
Zip archives are actually more of a 'filesystem' with content metadata etc. See help(unzip) for details. So to do what you sketch out above you need to
- Create a temp. file name (e.g. tempfile())
- Use download.file() to fetch the file into the temp. file
- Use unz() to extract the target file from the temp. file
- Remove the temp file via unlink()
which in code (thanks for the basic example, but this is simpler) looks like
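temp <- tempfile()                        # temporary file name for the archive
download.file("http://www.newcl.org/data/zipfiles/a1.zip", temp)
data <- read.table(unz(temp, "a1.dat"))   # read a1.dat straight from the archive
unlink(temp)                              # remove the temp file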
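The same four steps can be wrapped in a small function. This is only a sketch: the name read_zipped_table and its arguments are hypothetical and not part of the original answer. It additionally assumes you want the temp file removed even when the download or the read fails, which is why it uses on.exit().

```r
# Hypothetical helper (not from the original answer): download a zip archive,
# read one member with read.table(), and clean up the temp file afterwards.
read_zipped_table <- function(url, member, ...) {
  temp <- tempfile(fileext = ".zip")
  on.exit(unlink(temp), add = TRUE)   # runs even if a step below fails
  # mode = "wb" keeps the archive intact on platforms that distinguish text/binary
  download.file(url, temp, mode = "wb", quiet = TRUE)
  # read.table() opens the unz() connection and closes it again when done
  read.table(unz(temp, member), ...)
}

# Example call, using the URL and member name from the question:
# a1 <- read_zipped_table("http://www.newcl.org/data/zipfiles/a1.zip", "a1.dat")
```

If the member name inside the archive is not known in advance, unzip(temp, list = TRUE) returns a data frame of the archive's contents without extracting anything.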
Compressed (.z) or gzipped (.gz) or bzip2ed (.bz2) files are just the file, and those you can read directly from a connection. So get the data provider to use that instead :)
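To illustrate reading directly from a connection, here is a minimal sketch for the gzip case. The data and the names gz_path and gz_data are made up for illustration. The example writes its own small .gz file so that it runs without a network connection.

```r
# Write a small gzip-compressed table to a temporary file (illustrative data only).
gz_path <- tempfile(fileext = ".gz")
con <- gzfile(gz_path, "w")
write.table(data.frame(id = 1:3, value = c(2.5, 3.5, 4.5)), con, row.names = FALSE)
close(con)

# Read it back straight from the compressed connection; no extraction step needed.
gz_data <- read.table(gzfile(gz_path), header = TRUE)
unlink(gz_path)
```

bzfile() plays the same role for .bz2 files. Unlike the zip case, there is no member name to supply, because the compressed file holds exactly one stream.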