Using R to download zipped data file, extract, and import data
Question
@EZGraphs on Twitter writes: "Lots of online csvs are zipped. Is there a way to download, unzip the archive, and load the data to a data.frame using R? #Rstats"
I was also trying to do this today, but ended up just downloading the zip file manually.
I tried something like:
fileName <- "http://www.newcl.org/data/zipfiles/a1.zip"
con1 <- unz(fileName, filename="a1.dat", open = "r")
but I feel as if I'm a long way off. Any thoughts?
Recommended answer
Zip archives are actually more of a 'filesystem' with content metadata etc. See help(unzip) for details. So to do what you sketch out above you need to
- Create a temp. file name (e.g. tempfile())
- Use download.file() to fetch the file into the temp. file
- Use unz() to extract the target file from the temp. file
- Remove the temp file via unlink()
which in code (thanks for basic example, but this is simpler) looks like
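temp <- tempfile()                                              # temporary file name
download.file("http://www.newcl.org/data/zipfiles/a1.zip", temp)  # fetch the archive
data <- read.table(unz(temp, "a1.dat"))                         # read one member of the archive
unlink(temp)                                                    # delete the temporary file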
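The same four steps can be wrapped in a small function so the temp file is removed even when the download or the read fails. This is only a sketch: `read_zipped_table` is a hypothetical helper name, not part of R or of the original answer, and the explicit `mode = "wb"` is an assumption added to request a binary download.

```r
# Hypothetical helper (not from the original answer): download a zip archive,
# read one member with read.table(), and always clean up the temp file.
read_zipped_table <- function(url, member, ...) {
  temp <- tempfile(fileext = ".zip")
  on.exit(unlink(temp), add = TRUE)       # runs even if a step below fails
  download.file(url, temp, mode = "wb")   # assumption: request binary mode explicitly
  # If the member name is unknown, unzip(temp, list = TRUE)$Name lists the contents
  read.table(unz(temp, member), ...)      # extra arguments go to read.table()
}

# Usage with the URL and member name from the question (needs network access):
# data <- read_zipped_table("http://www.newcl.org/data/zipfiles/a1.zip", "a1.dat")
```

Using on.exit() instead of a trailing unlink() is a design choice: the cleanup is registered before anything can go wrong, so a failed download does not leave a stray file behind.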
Compressed (.z) or gzipped (.gz) or bzip2ed (.bz2) files are just the file, and those you can read directly from a connection. So get the data provider to use that instead :)
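To illustrate the point about connections, here is a minimal sketch that reads a gzipped CSV through gzfile() without a separate decompress step. The file and its contents are made up for the example so that it runs without network access; for a .bz2 file, bzfile() would play the same role.

```r
# Write a small gzipped CSV to a temp file so the example is self-contained.
path <- tempfile(fileext = ".csv.gz")
con <- gzfile(path, "w")
writeLines(c("x,y", "1,a", "2,b"), con)
close(con)

# Read it back straight from a gzfile() connection; read.csv() opens and
# closes the connection itself.
df <- read.csv(gzfile(path))
unlink(path)
```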