Using R to download zipped data file, extract, and import .csv

Question

I am trying to download and extract a .csv file from a webpage using R.

This question is related to the earlier question "Using R to download zipped data file, extract, and import data".

I cannot get the solution to work, but that may be due to the web address I am using.

I am trying to download the .csv files from http://data.worldbank.org/country/united-kingdom (under the "Download data" drop-down).

Using @Dirk's solution from the question above, I tried:

temp <- tempfile()
download.file("http://api.worldbank.org/v2/en/country/gbr?downloadformat=csv",temp)
con <- unz(temp, "gbr_Country_en_csv_v2.csv")
dat <- read.table(con, header=T, skip=2)
unlink(temp)

I got the extended link by looking at the page source code, which I suspect is causing the problem, although the link works if I paste it into the address bar.

The file downloads with the correct size:

download.file("http://api.worldbank.org/v2/en/country/gbr?downloadformat=csv",temp)
# trying URL 'http://api.worldbank.org/v2/en/country/gbr?downloadformat=csv'
# Content type 'application/zip' length 332358 bytes (324 Kb)
# opened URL
# downloaded 324 Kb

# also tried unzip but get this warning
con <- unzip(temp, "gbr_Country_en_csv_v2.csv")
# Warning message:
# In unzip(temp, "gbr_Country_en_csv_v2.csv") :
# requested file not found in the zip file

But these are the file names I get when I download the files manually.

I'd appreciate some help with where I am going wrong. Thanks.

I am using Windows 8, R version 3.1.0.

Recommended answer

To get your data to download and uncompress, you need to set mode="wb" in download.file:

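# mode="wb" writes the download in binary mode so the zip archive stays intact
download.file("...", temp, mode="wb")
unzip(temp, "gbr_Country_en_csv_v2.csv")
# skip=2 skips the two lines that precede the header row
dd <- read.table("gbr_Country_en_csv_v2.csv", sep=",", skip=2, header=TRUE)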

It looks like the default mode is "w", which assumes a text file. If it were a plain csv file, this would be fine. But since the file is compressed, it's a binary file, hence the "wb". Without the "wb" part, you can't open the zip at all.
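
Putting the pieces together, here is a minimal sketch of the whole workflow as one function. The helper name `read_zipped_csv` and its `pattern` argument are made up for illustration and are not part of R or of the answer above. Rather than hard-coding the member name, it lists the archive contents with `unzip(list = TRUE)`, which may help if the file inside the zip is not named quite as expected.

```r
# Hypothetical helper (not a base R function): download a zip archive
# and read the first CSV inside it whose name matches `pattern`.
read_zipped_csv <- function(url, pattern = "\\.csv$", ...) {
  zip_path <- tempfile(fileext = ".zip")
  exdir <- tempfile()
  # Remove the temporary archive and extraction directory when done
  on.exit(unlink(c(zip_path, exdir), recursive = TRUE), add = TRUE)

  # mode = "wb" writes the download as binary; this matters on Windows
  download.file(url, zip_path, mode = "wb")

  # List the archive members instead of guessing the file name
  members <- unzip(zip_path, list = TRUE)$Name
  target <- grep(pattern, members, value = TRUE)
  if (length(target) == 0) {
    stop("no file matching '", pattern, "' in archive; found: ",
         paste(members, collapse = ", "))
  }

  # Extract only the first match; unzip() returns the extracted path
  csv_path <- unzip(zip_path, files = target[1], exdir = exdir)
  read.csv(csv_path, ...)
}

# Usage with the URL from the question (assumes the URL and the
# file-name prefix are still valid; not run here):
# dat <- read_zipped_csv(
#   "http://api.worldbank.org/v2/en/country/gbr?downloadformat=csv",
#   pattern = "^gbr_Country", skip = 2)
```

Extra arguments such as `skip = 2` are passed through to `read.csv`, which already defaults to `sep = ","` and `header = TRUE`.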
