Best method for automating bulk file download

Problem description

I am attempting to create a cron job that downloads image files that are stored in a queue in our database.

All of the functions that we are using work properly when run on our web server; however, when I run the cron job with the command php index.php cron image_download, I receive a Segmentation Fault error.

Debugging the cron job shows that this error occurs when the data is passed to the get_url_content function, which is called here:

foreach($urls as $url){
    $content = $this->get_url_content($url);
}

The function is as follows:

function get_url_content($url){
    $agent= 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.0.3705; .NET CLR 1.1.4322)';
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_VERBOSE, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1);
    curl_setopt($ch, CURLOPT_USERAGENT, $agent);
    curl_setopt($ch, CURLOPT_URL,$url);
    return curl_exec($ch);
}

Is there a better way to download these files? Is it likely that a different method would not cause the same segmentation fault error? Thank you!

UPDATE: It appears that various methods I am trying are continually causing issues. I am seeing either "Segmentation Fault" or "Killed" errors returned from the cron job. Someone recommended that I look into using Iron.io for this so I am going to check that out. If anyone has other recommendations for how to manage this best I would appreciate additional information, thanks.
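A "Killed" message from a cron job is often the kernel's out-of-memory killer ending the process, which can happen when every downloaded file is buffered into a PHP string before being written out. One possible workaround, sketched below with CURLOPT_FILE, streams each response straight to disk so memory use stays flat regardless of file size. The download_to_file name, the timeout value, and the option choices are this sketch's own assumptions, not part of the original code:

```php
<?php
// Sketch: stream a download directly to disk instead of buffering it in memory.
// $url and $path are placeholders; adapt to the queue rows from the database.
function download_to_file($url, $path)
{
    $fp = fopen($path, 'wb');
    if ($fp === false) {
        return false;
    }
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fp);            // write the body straight to $fp
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects to the image
    curl_setopt($ch, CURLOPT_TIMEOUT, 120);         // don't let a hung transfer stall cron
    $ok = curl_exec($ch);
    curl_close($ch);                                // free the handle on every iteration
    fclose($fp);
    return $ok;
}
```

Closing the curl handle on every iteration also matters in a long-running loop, since each open handle holds memory and sockets.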

Recommended answer

You can try this approach, but before that, are you giving it the full URL?

function get_content($url){
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_VERBOSE, 1);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_AUTOREFERER, false);
    curl_setopt($ch, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    $result = curl_exec($ch);
    curl_close($ch);
    return($result);
}

function save_content($text,$new_filename){
    $fp = fopen($new_filename, 'w');
    fwrite($fp, $text);
    fclose($fp);
}

// replace this with your array of urls from the database (make sure it is an array)
$urls = ['http://domain.com/path/to/file.zip', 'http://another.com/path/to/image.img'];

foreach($urls as $url){
    $new_filename = basename($url);
    $temp = get_content($url);
    save_content($temp,$new_filename);
}
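One caveat with the loop above: curl_exec() returns false on failure, and save_content() would then write an empty file over the target name. A hedged variant is sketched below; the fetch() helper is this sketch's stand-in for the answer's get_content(), with error checking added so failed downloads are skipped and can stay in the queue for a retry:

```php
<?php
// Sketch: same download loop, but skipping failed transfers instead of
// writing empty files. fetch() is a stand-in for get_content() above.
function fetch($url)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $body = curl_exec($ch);
    $err  = curl_errno($ch);
    curl_close($ch);
    return ($err === 0) ? $body : false;   // false signals a failed download
}

$urls = [];  // fill with the queued URLs from the database

foreach ($urls as $url) {
    $body = fetch($url);
    if ($body === false) {
        error_log("Download failed for $url, skipping");
        continue;                          // leave the item queued for a retry
    }
    file_put_contents(basename($url), $body);
}
```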



This would get the file contents via its complete URL and save it to disk, thus downloading the file.

If you are not limited to curl, you may try something like:

$urls = ['http://domain.com/path/to/file.zip', 'http://another.com/path/to/image.img'];

foreach($urls as $url){
    $new_filename = basename($url);
    // fopen() here could equally be replaced with file_get_contents($url)
    file_put_contents($new_filename, fopen($url, 'r'));
}
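PHP's built-in copy() wraps the same fopen/file_put_contents idea in a single call, streaming the source to the destination; like file_get_contents(), it needs allow_url_fopen enabled for http:// sources. A minimal sketch:

```php
<?php
// copy() streams the source URL straight to the destination path in one call.
$urls = [];  // fill with the queued URLs from the database

foreach ($urls as $url) {
    if (!copy($url, basename($url))) {
        error_log("copy() failed for $url");
    }
}
```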

Or even:

foreach($urls as $url){
    $new_filename = basename($url);
    // escapeshellarg() guards against shell injection if a URL contains metacharacters
    shell_exec('wget ' . escapeshellarg($url) . ' -O ' . escapeshellarg($new_filename));
}

