Getting final urls of shortened urls (like bit.ly) using php

Problem description

[Updated At Bottom]
Hi everyone.

Start with Short URLs:
Say you have a collection of 5 short urls (like http://bit.ly), like this:

$shortUrlArray = array("http://bit.ly/123",
"http://bit.ly/123",
"http://bit.ly/123",
"http://bit.ly/123",
"http://bit.ly/123");

End with Final, Redirected URLs:
How can I get the final url of these short urls with php? Like this:

http://www.example.com/some-directory/some-page.html
http://www.example.com/some-directory/some-page.html
http://www.example.com/some-directory/some-page.html
http://www.example.com/some-directory/some-page.html
http://www.example.com/some-directory/some-page.html

I have one method (found online) that works well with a single url, but when looping over multiple urls, it only works with the final url in the array. For your reference, the method is this:

function get_web_page( $url ) 
{ 
    $options = array( 
        CURLOPT_RETURNTRANSFER => true,     // return web page 
        CURLOPT_HEADER         => true,    // return headers 
        CURLOPT_FOLLOWLOCATION => true,     // follow redirects 
        CURLOPT_ENCODING       => "",       // handle all encodings 
        CURLOPT_USERAGENT      => "spider", // who am i 
        CURLOPT_AUTOREFERER    => true,     // set referer on redirect 
        CURLOPT_CONNECTTIMEOUT => 120,      // timeout on connect 
        CURLOPT_TIMEOUT        => 120,      // timeout on response 
        CURLOPT_MAXREDIRS      => 10,       // stop after 10 redirects 
    ); 

    $ch      = curl_init( $url ); 
    curl_setopt_array( $ch, $options ); 
    $content = curl_exec( $ch ); 
    $err     = curl_errno( $ch ); 
    $errmsg  = curl_error( $ch ); 
    $header  = curl_getinfo( $ch ); 
    curl_close( $ch ); 

    //$header['errno']   = $err; 
    //$header['errmsg']  = $errmsg; 
    //$header['content'] = $content; 
    print($header[0]); 
    return $header; 
}  


//Using the above method in a for loop

$finalURLs = array();

$lineCount = count($shortUrlArray);

for($i = 0; $i < $lineCount; $i++){ // "<" rather than "<=", so the loop stops at the last valid index

    $singleShortURL = $shortUrlArray[$i];

    $myUrlInfo = get_web_page( $singleShortURL ); 

    $rawURL = $myUrlInfo["url"];

    array_push($finalURLs, $rawURL);

}

Close, but not enough
This method works, but only with a single url. I can't use it in a for loop, which is what I want to do. When used in a for loop as in the above example, the first four elements come back unchanged, and only the final element is converted into its final url. This happens whether the array is 5 elements or 500 elements long.

Solution Sought:
Please give me a hint as to how you'd modify this method to work inside a for loop with a collection of urls (rather than just one).

-OR-

If you know of code that is better suited for this task, please include it in your answer.

Thanks in advance.

Update:
After some further prodding I've found that the problem lies not in the above method (which, after all, seems to work fine in for loops) but possibly in encoding. When I hard-code an array of short urls, the loop works fine. But when I pass in a block of newline-separated urls from an html form using GET or POST, the above-mentioned problem ensues. Are the urls somehow being changed into a format not compatible with the method when I submit the form?

New Update:
You guys, I've found that my problem was due to something unrelated to the above method. The URL encoding of my short urls converted what I thought were just newline characters (separating the urls) into this: %0D%0A, which is a carriage return followed by a line feed. As a result, all short urls save for the final url in the collection had a "ghost" character appended to the tail, making it impossible to retrieve the final urls for those. I identified the ghost character, corrected my php explode, and all works fine now. Sorry and thanks.
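For readers who hit the same "ghost" character, here is a minimal sketch of one way to split the submitted block so that no stray carriage return survives. The helper name split_url_lines() and the sample input are illustrative assumptions, not part of the original post, and the asker's actual fix to explode may have looked different.

```php
<?php
// Hypothetical helper (not from the original post): split a textarea value
// into clean URLs regardless of the line-ending style that was submitted.
function split_url_lines(string $raw): array
{
    // \R matches \r\n, \n and \r (among other line-break sequences);
    // PREG_SPLIT_NO_EMPTY drops empty lines.
    $lines = preg_split('/\R/', $raw, -1, PREG_SPLIT_NO_EMPTY);

    // trim() removes stray spaces or tabs left around each URL.
    $lines = array_map('trim', $lines);

    // Drop entries that were whitespace only, then reindex from 0.
    return array_values(array_filter($lines, 'strlen'));
}

// Sample input: form textareas are typically submitted with \r\n line
// breaks, which appear as %0D%0A once URL-encoded.
$raw = "http://bit.ly/123\r\nhttp://bit.ly/123\r\n";

// Splitting on "\n" alone leaves a trailing "\r" on each complete line.
var_dump(explode("\n", $raw)[0] === "http://bit.ly/123\r"); // bool(true)

print_r(split_url_lines($raw));
```

Splitting on \R and trimming each entry avoids having to know in advance whether the input uses "\n" or "\r\n".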

Recommended answer

This may be of some help: How to put string in array, split by new line?

You would probably do something like this, assuming you're getting the URLs returned in POST:

$final_urls = array();

// You can replace chr(10) with "\n" or "\r\n", depending on how you get your urls.
// And of course, change $_POST['short_urls'] to the source of your string.
$short_urls = explode( chr(10), $_POST['short_urls'] );

foreach ( $short_urls as $short ) {
    $final_urls[] = get_web_page( $short );
}



I get the following output, using var_dump($final_urls); and your bit.ly url:

http://codepad.org/8YhqlCo1

My source: $_POST['short_urls'] = "http://bit.ly/123\nhttp://bit.ly/123\nhttp://bit.ly/123\nhttp://bit.ly/123";

I also got an error, using your function: Notice: Undefined offset: 0 in /var/www/test.php on line 27 Line 27: print($header[0]); I'm not sure what you wanted there...

Here's my test.php, if it will help: http://codepad.org/zI2wAOWL
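On the notice mentioned above: curl_getinfo() called without an option returns an associative array (keys such as 'url' and 'http_code'), so there is no element 0 to print. If only the final address is needed, a leaner variant might look like the sketch below. The names resolve_final_url() and resolve_all() are hypothetical, the 10-second timeouts are an assumed choice, and the HEAD-style request (CURLOPT_NOBODY) is an assumption that may not suit every shortener.

```php
<?php
// Sketch only: resolve_final_url() and resolve_all() are hypothetical names.
function resolve_final_url(string $shortUrl): ?string
{
    $ch = curl_init($shortUrl);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // do not echo the response
        CURLOPT_NOBODY         => true, // headers only; the body is not needed
        CURLOPT_FOLLOWLOCATION => true, // follow the redirect chain
        CURLOPT_MAXREDIRS      => 10,   // same cap as the question's function
        CURLOPT_CONNECTTIMEOUT => 10,   // assumed timeouts, shorter than 120
        CURLOPT_TIMEOUT        => 10,
    ]);
    $ok = curl_exec($ch);

    // CURLINFO_EFFECTIVE_URL is the last URL cURL requested, i.e. the
    // same value as the 'url' key of the full curl_getinfo() array.
    $url = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);

    // null signals that the request failed.
    return ($ok !== false && $url !== '') ? $url : null;
}

// Resolve a whole list. The resolver is injectable so the loop can be
// exercised without network access.
function resolve_all(array $shortUrls, callable $resolver = 'resolve_final_url'): array
{
    $final = [];
    foreach ($shortUrls as $short) { // foreach avoids index arithmetic
        $final[] = $resolver($short);
    }
    return $final;
}
```

Calling resolve_all($short_urls) on a cleaned list would then return one final url (or null on failure) per input, in the same order.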
