Downloading multiple files in R with variable length, nested URLs
Question
New member here. I am trying to download a large number of files from a website in R (but I am open to other suggestions as well, such as wget).
From this post, I understand I must create a vector with the desired URLs. My initial problem is to write this vector, since I have 27 states and 34 agencies within each state. I must download one file for each agency for all states. Whereas the state codes are always two characters, the agency codes are 2 to 7 characters long. The URLs would look like this:
http://website.gov/xx_yyyyyyy.zip
where xx is the state code and yyyyyyy is the agency code, between 2 and 7 characters long. I am lost as to how to build such a loop.
I assume I can then download this list of URLs with the following loop:
# urls and destinations are character vectors of the same length
for (i in seq_along(urls)) {
  download.file(urls[i], destinations[i], mode = "wb")
}
Does this make sense?
(Disclaimer: an incomplete version of this post was uploaded earlier. My mistake, sorry!)
Recommended answer
This will download them in batches, one batch per state, and take advantage of the speedier simultaneous downloading capabilities of download.file() if the libcurl option is available on your installation of R:
library(purrr)

# Example state and agency codes; substitute the real ones
states <- state.abb[1:27]
agencies <- c("AID", "AMBC", "AMTRAK", "APHIS", "ATF", "BBG", "DOJ", "DOT",
              "BIA", "BLM", "BOP", "CBFO", "CBP", "CCR", "CEQ", "CFTC", "CIA",
              "CIS", "CMS", "CNS", "CO", "CPSC", "CRIM", "CRT", "CSB", "CSOSA",
              "DA", "DEA", "DHS", "DIA", "DNFSB", "DOC", "DOD", "DOE", "DOI")

# For each state, build that state's URLs and download them as one batch
walk(states, function(x) {
  map(x, ~sprintf("http://website.gov/%s_%s.zip", ., agencies)) %>%
    flatten_chr() -> urls
  # With method = "libcurl", download.file() accepts vectors of URLs and files
  download.file(urls, basename(urls), method = "libcurl")
})
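For readers who prefer base R, or whose installation lacks libcurl, the following is a minimal sketch of the same idea rather than a definitive implementation. It assumes three made-up state codes and three made-up agency codes, and the helper name download_all is hypothetical. It builds the full vector of URLs with expand.grid() and sprintf(), then falls back to a one-file-at-a-time loop when capabilities("libcurl") is FALSE.

```r
# Hypothetical codes, for illustration only; substitute the real ones.
states   <- c("AL", "AK", "AZ")
agencies <- c("DA", "AMTRAK", "DNFSB")

# One row per state/agency pair; the first column (agency) varies fastest.
combos <- expand.grid(agency = agencies, state = states,
                      stringsAsFactors = FALSE)

# sprintf() is vectorised, and %s accepts codes of any length.
urls <- sprintf("http://website.gov/%s_%s.zip", combos$state, combos$agency)
destinations <- basename(urls)

# Hypothetical helper: download everything, preferring libcurl if present.
download_all <- function(urls, destinations) {
  if (capabilities("libcurl")) {
    # The libcurl method accepts vectors of URLs and destination files.
    download.file(urls, destinations, method = "libcurl", mode = "wb")
  } else {
    # Fallback: one file at a time; a failed download is reported and skipped.
    for (i in seq_along(urls)) {
      tryCatch(
        download.file(urls[i], destinations[i], mode = "wb"),
        error = function(e) message("Failed: ", urls[i])
      )
    }
  }
  invisible(destinations)
}

# download_all(urls, destinations)  # uncomment to run against the real site
```

Because sprintf() pads nothing and simply substitutes each code, the 2 to 7 character agency codes need no special handling. The tryCatch() in the fallback loop is a design choice so that one missing file does not abort the remaining downloads.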