Download All Files From a Folder on a Website
Question
My question is: in R, how do I download all the files on a website? I know how to download them one by one, but not all at once. For example:
http://www2.census.gov/geo/docs/maps-data/data/rel/t00t10/
Answer
I tested this on a small subset (3) of the 56 files on the page, and it works fine.
## your base url
url <- "http://www2.census.gov/geo/docs/maps-data/data/rel/t00t10/"
## query the url to get all the file names ending in '.zip'
zips <- XML::getHTMLLinks(
url,
xpQuery = "//a/@href['.zip'=substring(., string-length(.) - 3)]"
)
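If the XPath predicate looks opaque, an equivalent approach (a sketch, assuming the same XML package) is to fetch every link on the page and then filter for the ".zip" suffix in R:

```r
## grab every link on the page, then keep only those ending in '.zip'
links <- XML::getHTMLLinks("http://www2.census.gov/geo/docs/maps-data/data/rel/t00t10/")
zips <- grep("\\.zip$", links, value = TRUE)
```

This trades one XPath expression for a plain regular expression, which some readers may find easier to maintain.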
## create a new directory 'myzips' to hold the downloads
dir.create("myzips")
## save the current directory path for later
wd <- getwd()
## change working directory for the download
setwd("myzips")
## create all the new files
file.create(zips)
## download them all
lapply(paste0(url, zips), function(x) download.file(x, basename(x)))
## reset working directory to original
setwd(wd)
Now all the zip files are in the directory myzips and are ready for further processing. As an alternative to lapply() you could also use a for() loop.
## download them all
for(u in paste0(url, zips)) download.file(u, basename(u))
And of course, setting quiet = TRUE may be nice since we're downloading 56 files.
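One more option worth passing on Windows: for binary files such as zip archives, download.file() should also be given mode = "wb", or the downloaded archives may be corrupted by text-mode line-ending translation. A sketch of the loop with both options (the file name shown is a placeholder, not one of the actual census files):

```r
url <- "http://www2.census.gov/geo/docs/maps-data/data/rel/t00t10/"
## 'zips' would normally come from getHTMLLinks() as above;
## a hypothetical single entry is used here for illustration
zips <- c("example.zip")
for (u in paste0(url, zips)) {
  download.file(u, basename(u), quiet = TRUE, mode = "wb")
}
```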