Downloading multiple files in parallel in R


Problem description

I am trying to download 460,000 files from an FTP server (the file list comes from the TRMM data archive). I made a list of all the files and split them into separate jobs, but can anyone help me run those jobs at the same time in R? Here is an example of what I have tried:

my.list <- readLines("1998-2010.txt")  # FTP address of each file, one per line
name <- basename(my.list)  # local file names; assumed, since `name` is not defined in the original post
job1 <- for (i in 1:1000) {
    download.file(my.list[i], name[i], mode = "wb")
}
job2 <- for (i in 1001:2000) {
    download.file(my.list[i], name[i], mode = "wb")
}
job3 <- for (i in 2001:3000) {
    download.file(my.list[i], name[i], mode = "wb")
}

Now I'm stuck on how to run all of the jobs at the same time.

Thanks for your help.

Recommended answer

Don't do that. Really. Don't. It won't be any faster, because the limiting factor is going to be the network speed. You'll just end up with a large number of even slower downloads, and then the server will give up and throw you off, leaving you with a large number of half-downloaded files.

Downloading multiple files at once will also increase the disk load, since your PC is now trying to save a large number of files at the same time.

Here is an alternative solution.

Use R (or some other tool; it's a one-line awk script starting from your list) to write an HTML file that looks like this:

<a href="ftp://example.com/path/file-1.dat">file-1.dat</a>
<a href="ftp://example.com/path/file-2.dat">file-2.dat</a>

and so on. Now open this file in your web browser and use a download manager (e.g. DownThemAll for Firefox) and tell it to download all the links. With DownThemAll you can specify how many downloads run simultaneously, how many times to retry failures, and so on.
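
The HTML file described above could be generated in R along these lines. This is only a sketch: the helper name write_link_page and the output file name links.html are made up for illustration, and it assumes 1998-2010.txt holds one FTP URL per line, as in the question, with no characters in the URLs that would need HTML escaping.

```r
# Hypothetical helper (not from the original answer): write one <a> link
# per URL, using the last path component of the URL as the link text.
write_link_page <- function(urls, out_file = "links.html") {
  labels <- basename(urls)
  links <- sprintf('<a href="%s">%s</a>', urls, labels)
  writeLines(links, out_file)
  invisible(out_file)
}

# Usage, assuming the list file from the question is in the working directory.
if (file.exists("1998-2010.txt")) {
  my.list <- readLines("1998-2010.txt")
  write_link_page(my.list, "links.html")
}
```

Opening links.html in the browser then gives the download manager one link per file to pick up.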
