Fastest way to download multiple URLs

Question

I have a web portal that needs to download many separate JSON files and display their contents in a sort of form view. By many I mean at least 32 separate files.

I've tried cURL with brute-force iteration (one request after another), and it takes ~12.5 seconds.

I've tried curl_multi_exec as demonstrated at http://www.php.net/manual/en/function.curl-multi-init.php with the function below, and it takes ~9 seconds. A little better, but still terribly slow.

function multiple_threads_request($nodes){
    $mh = curl_multi_init();
    $curl_array = array();
    // Create one cURL handle per URL and attach it to the multi handle.
    foreach($nodes as $i => $url)
    {
        $curl_array[$i] = curl_init($url);
        curl_setopt($curl_array[$i], CURLOPT_RETURNTRANSFER, true);
        curl_multi_add_handle($mh, $curl_array[$i]);
    }
    // Drive all transfers until none are still running.
    $running = NULL;
    do {
        curl_multi_exec($mh,$running);
    } while($running > 0);

    // Collect each response body, keyed by its URL.
    $res = array();
    foreach($nodes as $i => $url)
    {
        $res[$url] = curl_multi_getcontent($curl_array[$i]);
    }

    // Detach the handles and release the multi handle.
    foreach($nodes as $i => $url){
        curl_multi_remove_handle($mh, $curl_array[$i]);
    }
    curl_multi_close($mh);
    return $res;
}

I realize this is an inherently expensive operation, but does anyone know of any alternatives that might be faster?

In the end, my system was limiting curl_multi_exec; moving the code to a production machine brought dramatic improvements.
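
A side note on the function above: its do/while loop calls curl_multi_exec continuously, which keeps a CPU core busy while the transfers are in flight. Below is a minimal sketch of a variant that waits on socket activity with curl_multi_select instead. The name fetch_all and the $timeout parameter are hypothetical, not from the original post, and this is not guaranteed to reduce total wall-clock time; it mainly avoids spinning and caps how long a single slow URL can hold everything up.

```php
<?php
// Hypothetical helper: fetch_all() and $timeout are illustrative names.
function fetch_all(array $urls, int $timeout = 10): array
{
    $mh = curl_multi_init();
    $handles = [];
    foreach ($urls as $i => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        // Assumed per-transfer cap in seconds; tune for your endpoints.
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
        curl_multi_add_handle($mh, $ch);
        $handles[$i] = $ch;
    }

    $running = 0;
    do {
        $status = curl_multi_exec($mh, $running);
        if ($status === CURLM_OK && $running > 0) {
            // Sleep until a socket is ready (or 1 second passes) instead of spinning.
            if (curl_multi_select($mh, 1.0) === -1) {
                usleep(1000); // select failed; back off briefly before retrying
            }
        }
    } while ($running > 0 && $status === CURLM_OK);

    // Collect bodies keyed by URL, then detach each handle.
    $results = [];
    foreach ($handles as $i => $ch) {
        $results[$urls[$i]] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch);
    }
    curl_multi_close($mh);
    return $results;
}
```

It would be called the same way as the original, for example `$res = fetch_all($nodes);`.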

Recommended answer

You should definitely benchmark your cURL requests to see which one is causing the slowdown. This was too lengthy for a comment, so let me know whether it helps:

// revert to "cURLing with brute force iteration" as you described it :)

$curl_timer = array();

foreach($curlsite as $row)
{
    $start = microtime(true);

    /**
     * curl code
     */

    $curl_timer[] = (microtime(true)-$start);
}

echo '<pre>'.print_r($curl_timer, true).'</pre>';
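
As a rough, filled-in version of the timing loop above, here is a sketch that times each URL separately and sorts the results slowest first. The function names time_each and fetch_with_curl are hypothetical, and the fetcher is passed in as a callable so the timing logic can be tried without touching the network.

```php
<?php
// Hypothetical helpers: time_each() and fetch_with_curl() are not from the original post.

// Fetch one URL sequentially; returns the body, or false on failure.
function fetch_with_curl(string $url)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    return curl_exec($ch);
}

// Time $fetch for every URL and return seconds per URL, slowest first.
function time_each(array $urls, callable $fetch): array
{
    $timings = [];
    foreach ($urls as $url) {
        $start = microtime(true);
        $fetch($url);
        $timings[$url] = microtime(true) - $start;
    }
    arsort($timings); // sort descending by elapsed time, keeping URL keys
    return $timings;
}
```

With the list of URLs from the snippet above, usage would look like `echo '<pre>'.print_r(time_each($curlsite, 'fetch_with_curl'), true).'</pre>';`. If one or two URLs dominate the total, the problem is more likely those endpoints than the download method.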
