curl redirect not working properly


Question

My function is supposed to get the destination URL of $url:

function getUrl($url)
{
    $user_agent='Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)';
    $ch = curl_init(); 
    $timeout = 10; // set to zero for no timeout 
    curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, $timeout); 
    curl_setopt ($ch, CURLOPT_URL, $url); 
    curl_setopt ($ch, CURLOPT_USERAGENT, $user_agent);
    curl_setopt ($ch, CURLOPT_HEADER, 1);
    curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1); 
    curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, true);
    $curl = curl_exec($ch);
    $header = curl_getinfo($ch);
    curl_close($ch); 
    return $header;

}

function get_url_list() {

 $url = "http://www.webliste.ch/click.aspx?nr=148252";
 $result=getUrl($url);
 print_r($result);echo "<br>";
}

get_url_list();

The result is:

Array
(
    [url] => http://www.webliste.ch/click.aspx?nr=148252
    [content_type] => text/html; charset=iso-8859-1
    [http_code] => 200
    [header_size] => 320
    [request_size] => 139
    ...
    [redirect_time] => 0
    [certinfo] => Array
        (
        )

    [redirect_url] => 
)

I am at a loss, because the URL is redirecting, and if I echo $ch, I get the redirected website.

Does anyone know what causes this?

The following does not work either:

$final_url = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);

The output is the same as $result['url'], which is not what I am looking for.

Recommended answer

I've analyzed what actually happens: the redirect is not caused by a redirect header on that page. Instead, the page uses JavaScript to submit a form instantly, which redirects you to the start page.
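This also explains the curl_getinfo() output in the question: CURLOPT_FOLLOWLOCATION only follows HTTP Location headers, so a JavaScript redirect leaves http_code at 200 and redirect_url empty. A minimal sketch of that check follows; the helper name usedHttpRedirect is hypothetical, and it assumes the array comes from curl_getinfo().

```php
<?php
// Hypothetical helper (not from the original answer): decide whether a
// curl_getinfo() result shows any sign of an HTTP-level redirect.
function usedHttpRedirect(array $info): bool
{
    $code  = $info['http_code'] ?? 0;
    $count = $info['redirect_count'] ?? 0;

    // A non-zero redirect_count means cURL followed at least one Location
    // header; a 3xx final status means a redirect was sent but not followed.
    return $count > 0 || ($code >= 300 && $code < 400);
}

// Values taken from the output shown in the question.
$info = ['http_code' => 200, 'redirect_time' => 0, 'redirect_url' => ''];
var_dump(usedHttpRedirect($info)); // bool(false)
```

When this returns false but a browser still ends up somewhere else, the redirect is most likely happening client-side, in JavaScript or a meta refresh tag.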

It might be hard to determine the URL of the page, but what you can do is look for a <form> tag and read the URL from its action attribute.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="de">
   <head id="Head1">
      <title></title>
      <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
      <meta name="ROBOTS" content="NOINDEX, NOFOLLOW" />
   </head>

   <body>
      <form id="form1" action="http://www.taxiherold.ch">
         <div id="panGo" align="center">
            <script type="text/javascript">
               document.getElementById('form1').submit();
            </script>
         </div>
      </form>
   </body>
</html>

So try this code now:

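$ch = curl_init('http://www.webliste.ch/click.aspx?nr=148252');
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// The redirect happens in JavaScript, so there is no Location header to follow.
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);

$data = curl_exec($ch);
if ($data === false) {
    die('cURL error: ' . curl_error($ch));
}
curl_close($ch);

// Parse the returned HTML; @ suppresses warnings about malformed markup.
$dom = new DOMDocument();
@$dom->loadHTML($data);
$xpath = new DOMXPath($dom);

// Use the action attribute only if the body contains exactly one form.
$forms = $xpath->query('//body/form');
$url = ($forms->length == 1 ? $forms->item(0)->getAttribute('action') : null);

var_dump($url);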

For the HTML shown above, this outputs the form's action URL, http://www.taxiherold.ch.
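The extraction step can also be wrapped in a function that takes an HTML string, so it can be tried without a network request. This is only a sketch: the helper name extractFormAction is hypothetical, it uses libxml_use_internal_errors() instead of the @ operator, and it returns the first form that has an action attribute rather than requiring exactly one form.

```php
<?php
// Hypothetical helper (not from the original answer): return the action URL
// of the first <form> that has an action attribute, or null if none exists.
function extractFormAction(string $html): ?string
{
    if (trim($html) === '') {
        return null; // nothing to parse
    }

    // Collect markup warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $forms = (new DOMXPath($dom))->query('//form[@action]');
    return $forms->length > 0 ? $forms->item(0)->getAttribute('action') : null;
}

// Reduced version of the page returned by the URL in the question.
$html = '<html><body><form id="form1" action="http://www.taxiherold.ch">'
      . '<script>document.getElementById("form1").submit();</script>'
      . '</form></body></html>';
var_dump(extractFormAction($html)); // string(24) "http://www.taxiherold.ch"
```

Pass the body returned by curl_exec() to this function. If the action attribute is a relative URL, it still has to be resolved against the URL of the page that was fetched.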
