PHP: cURL and keep track of all redirections


Question

I'm looking to cURL a URL and keep track of each individual URL it goes through. For some reason, I am unable to accomplish this without making recursive cURL calls, which is not ideal. Perhaps I am missing some easy option. Thoughts?

 $url = "some url with redirects";
 $ch = curl_init($url);
 curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
 curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
 curl_setopt($ch, CURLOPT_HEADER, true);
 curl_setopt($ch, CURLOPT_NOBODY, false);
 curl_setopt($ch, CURLOPT_TIMEOUT, 10);
 curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
 curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1) Gecko/20061024 BonEcho/2.0");

 $html = curl_exec($ch);
 $info = array();
 if(!curl_errno($ch))
 {
      $info = curl_getinfo($ch);
      echo "<pre>";
      print_r($info);
      echo "</pre>";
 }

And I get a response like this:

Array
(
    [url] => THE LAST URL THAT WAS HIT
    [content_type] => text/html; charset=utf-8
    [http_code] => 200
    [header_size] => 1942
    [request_size] => 1047
    [filetime] => -1
    [ssl_verify_result] => 0
    [redirect_count] => 2   <---- I WANT THESE
    [total_time] => 0.799589
    [namelookup_time] => 0.000741
    [connect_time] => 0.104206
    [pretransfer_time] => 0.104306
    [size_upload] => 0
    [size_download] => 49460
    [speed_download] => 61856
    [speed_upload] => 0
    [download_content_length] => 49460
    [upload_content_length] => 0
    [starttransfer_time] => 0.280781
    [redirect_time] => 0.400723
)

Solution

You have

curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);

This means that cURL follows the redirects internally, and curl_getinfo() describes only the final response: you get the last URL and a redirect_count, but not the intermediate Location values.
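
If you would rather keep CURLOPT_FOLLOWLOCATION enabled and avoid recursion altogether, one option is to let cURL pass you each response header line as it arrives, through CURLOPT_HEADERFUNCTION, and record every Location value along the way. The following is a minimal sketch of that idea, not part of the original answer: `makeLocationCollector` and `fetchWithRedirectLog` are hypothetical names introduced here, and the limit of 10 redirects is an assumed default.

```php
<?php
// Hypothetical helper: builds a callback for CURLOPT_HEADERFUNCTION that
// appends every Location value it sees to the caller's $locations array.
function makeLocationCollector(array &$locations): callable
{
    return function ($ch, string $headerLine) use (&$locations): int {
        // Header names are case-insensitive, hence the /i flag.
        if (preg_match('/^Location:\s*(.*?)\s*$/i', $headerLine, $m)) {
            $locations[] = $m[1];
        }
        // cURL expects the number of bytes handled; any other value aborts the transfer.
        return strlen($headerLine);
    };
}

// Hypothetical wrapper: fetches $url, follows redirects, and returns the body
// together with the Location values exactly as the servers sent them.
function fetchWithRedirectLog(string $url): array
{
    $locations = [];
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 10); // assumed limit, adjust as needed
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    curl_setopt($ch, CURLOPT_HEADERFUNCTION, makeLocationCollector($locations));
    $body = curl_exec($ch);
    return ['body' => $body, 'locations' => $locations];
}
```

Calling `fetchWithRedirectLog('some url with redirects')` would then give one entry in `locations` per redirect. The values are recorded as received, so a relative Location stays relative.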

To follow the Location header manually:

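// $maxRedirects is an added safety limit so a redirect loop cannot recurse forever
function getWebPage($url, $redirectcallback = null, $maxRedirects = 10){
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false); // redirects are followed by hand below
    curl_setopt($ch, CURLOPT_HEADER, true);          // keep headers in the output so Location can be read
    curl_setopt($ch, CURLOPT_NOBODY, false);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
    curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1) Gecko/20061024 BonEcho/2.0");

    $html = curl_exec($ch);
    if ($html === false) {
        return false; // transfer failed
    }
    $http_code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    // 303, 307 and 308 are redirects too, not only 301 and 302
    if (in_array($http_code, array(301, 302, 303, 307, 308)) && $maxRedirects > 0) {
        list($httpheader) = explode("\r\n\r\n", $html, 2);
        $matches = array();
        // header names are case-insensitive, hence the /i flag
        if (preg_match('/^(?:Location|URI):\s*(.*?)\s*$/mi', $httpheader, $matches)) {
            $nurl = $matches[1];
            $url_parsed = parse_url($nurl);
            // only absolute URLs are followed; a relative Location is left unresolved
            if ($url_parsed !== false && isset($url_parsed['scheme'], $url_parsed['host'])) {
                if ($redirectcallback) { // callback
                    $redirectcallback($nurl, $url);
                }
                $html = getWebPage($nurl, $redirectcallback, $maxRedirects - 1);
            }
        }
    }
    return $html;
}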

function trackAllLocations($newUrl, $currentUrl){
    echo $currentUrl.' ---> '.$newUrl."\r\n";
}

getWebPage('some url with redirects', 'trackAllLocations');
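
One thing to watch when following redirects by hand: HTTP allows the Location header to carry a relative reference such as `/login` or `next.php`, which has to be resolved against the URL that produced it before it can be requested. The sketch below shows one simplified way to do that. `resolveLocation` is a hypothetical helper, not part of the original answer, and it deliberately skips `.`/`..` segment normalisation, userinfo, and query-only references.

```php
<?php
// Hypothetical helper: resolve a Location value against the URL that returned it.
// Simplified on purpose: no "."/".." normalisation, no userinfo, no query-only refs.
function resolveLocation(string $baseUrl, string $location): string
{
    // Already absolute (has a scheme): use it unchanged.
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $location)) {
        return $location;
    }
    $base = parse_url($baseUrl);
    if ($base === false || !isset($base['scheme'], $base['host'])) {
        return $location; // cannot resolve without an absolute base URL
    }
    // Scheme-relative reference: //host/path
    if (strncmp($location, '//', 2) === 0) {
        return $base['scheme'] . ':' . $location;
    }
    $origin = $base['scheme'] . '://' . $base['host']
            . (isset($base['port']) ? ':' . $base['port'] : '');
    // Absolute path: keep scheme, host and port, replace the path.
    if (strncmp($location, '/', 1) === 0) {
        return $origin . $location;
    }
    // Relative path: replace everything after the last "/" of the base path.
    $path = isset($base['path']) ? $base['path'] : '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $location;
}
```

In a manual redirect loop, the value read from the Location header would be passed through such a helper together with the current URL before the next request is made.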
