Getting final URLs of shortened URLs (like bit.ly) using PHP
Question
[Updated At Bottom]
Hi everyone.
Start with Short URLs:
Say you have an array of 5 short URLs (like http://bit.ly) as shown here:
$shortUrlArray = array("http://bit.ly/123",
"http://bit.ly/123",
"http://bit.ly/123",
"http://bit.ly/123",
"http://bit.ly/123");
End with Final, Redirected URLs:
How can I get the final url of these short urls with php? Like this:
http://www.example.com/some-directory/some-page.html
http://www.example.com/some-directory/some-page.html
http://www.example.com/some-directory/some-page.html
http://www.example.com/some-directory/some-page.html
http://www.example.com/some-directory/some-page.html
I have one method (found online) that works well with a single url, but when looping over multiple urls, it only works with the final url in the array. For your reference, the method is this:
function get_web_page( $url )
{
    $options = array(
        CURLOPT_RETURNTRANSFER => true,     // return web page
        CURLOPT_HEADER         => true,     // return headers
        CURLOPT_FOLLOWLOCATION => true,     // follow redirects
        CURLOPT_ENCODING       => "",       // handle all encodings
        CURLOPT_USERAGENT      => "spider", // who am i
        CURLOPT_AUTOREFERER    => true,     // set referer on redirect
        CURLOPT_CONNECTTIMEOUT => 120,      // timeout on connect
        CURLOPT_TIMEOUT        => 120,      // timeout on response
        CURLOPT_MAXREDIRS      => 10,       // stop after 10 redirects
    );

    $ch      = curl_init( $url );
    curl_setopt_array( $ch, $options );
    $content = curl_exec( $ch );
    $err     = curl_errno( $ch );
    $errmsg  = curl_error( $ch );
    $header  = curl_getinfo( $ch );
    curl_close( $ch );

    //$header['errno']   = $err;
    //$header['errmsg']  = $errmsg;
    //$header['content'] = $content;
    print($header[0]);
    return $header;
}
// Using the above method in a for loop
$finalURLs = array();
$lineCount = count($shortUrlArray);
// Use < rather than <= so the loop does not read one element past the end of the array
for ($i = 0; $i < $lineCount; $i++) {
    $singleShortURL = $shortUrlArray[$i];
    $myUrlInfo = get_web_page( $singleShortURL );
    $rawURL = $myUrlInfo["url"];
    array_push($finalURLs, $rawURL);
}
Close, but not enough
This method works, but only with a single URL. I can't use it in a for loop, which is what I want to do. When used in a for loop as in the above example, the first four elements come back unchanged, and only the final element is converted into its final URL. This happens whether the array is 5 elements or 500 elements long.
Solution Sought:
Please give me a hint as to how you'd modify this method to work when used inside of a for loop with a collection of URLs (rather than just one).
- OR -
If you know of code that is better suited for this task, please include it in your answer.
Thanks in advance.
Update:
After some further prodding I've found that the problem lies not in the above method (which, after all, seems to work fine in for loops) but possibly in encoding. When I hard-code an array of short URLs, the loop works fine. But when I pass in a block of newline-separated URLs from an HTML form using GET or POST, the above-mentioned problem ensues. Are the URLs somehow being changed into a format not compatible with the method when I submit the form?
New Update:
You guys, I've found that my problem was due to something unrelated to the above method. The URL encoding of my short URLs converted what I thought were just newline characters (separating the URLs) into this: %0D%0A, which is a carriage return followed by a line feed. As a result, every short URL except the final one in the collection had a "ghost" character appended to its tail, which made it impossible to retrieve the final URLs for those. I identified the ghost character, corrected my PHP explode, and all works fine now. Sorry and thanks.
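The stray character described above can be reproduced without a form. The following is only a minimal sketch: it assumes the textarea content arrives with CRLF line endings, and the $raw value is made up for illustration.

```php
<?php
// Hypothetical input: two URLs separated by CR + LF, as a textarea is typically submitted.
$raw = "http://bit.ly/123\r\nhttp://bit.ly/123";

// URL-encoding makes the invisible separator visible as %0D%0A.
$encoded = urlencode($raw);

// Splitting on "\n" alone leaves the "\r" attached to every URL except the last.
$parts = explode("\n", $raw);

foreach ($parts as $i => $part) {
    // bin2hex exposes the trailing 0d byte that a plain echo would hide.
    printf("%d: %d bytes, last byte 0x%s\n", $i, strlen($part), bin2hex(substr($part, -1)));
}
```

Every element except the last one is a byte longer than the URL it appears to contain, which is why only the last short URL resolved.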
Recommended Answer
This may be of some help: How to put a string into an array, split by new line?
You would probably do something like this, assuming you're getting the URLs returned in POST:
$final_urls = array();
// You can replace chr(10) with "\n" or "\r\n", depending on how you get your urls.
// And of course, change $_POST['short_urls'] to the source of your string.
$short_urls = explode( chr(10), $_POST['short_urls'] );
foreach ( $short_urls as $short ) {
    $final_urls[] = get_web_page( $short );
}
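Splitting on chr(10) alone keeps any carriage return that precedes it. One possible variant, sketched under the assumption that the field holds one URL per line with unknown line endings, splits on any common line break and trims each piece. split_url_lines is a hypothetical helper name, not part of the original answer.

```php
<?php
// Hypothetical helper: turn a newline-separated block into a clean list of URLs.
function split_url_lines(string $block): array
{
    // Accept Windows (\r\n), old Mac (\r) and Unix (\n) line endings alike.
    $lines = preg_split('/\r\n|\r|\n/', $block);

    // Remove leftover whitespace around each entry.
    $lines = array_map('trim', $lines);

    // Drop empty entries (blank lines, trailing newline) and reindex from 0.
    return array_values(array_filter($lines, function ($line) {
        return $line !== '';
    }));
}
```

With this in place the explode() line could become $short_urls = split_url_lines($_POST['short_urls']); and the rest of the loop stays unchanged.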
I get the following output, using var_dump($final_urls); and your bit.ly url:
My source: $_POST['short_urls'] = "http://bit.ly/123\nhttp://bit.ly/123\nhttp://bit.ly/123\nhttp://bit.ly/123\n";
I also got an error, using your function: Notice: Undefined offset: 0 in /var/www/test.php on line 27
Line 27: print($header[0]);
I'm not sure what you wanted there...
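The notice arises because curl_getinfo() called without a second argument returns an associative array (with keys such as 'url' and 'http_code'), so there is no offset 0. If only the final address is needed, one possible sketch asks cURL for CURLINFO_EFFECTIVE_URL directly. resolve_final_url is a hypothetical name, and the HEAD-style request (CURLOPT_NOBODY) rests on the assumption that the shortener answers such requests with the same redirects as a normal GET.

```php
<?php
// Hypothetical helper: follow redirects and return the last URL reached, or null on failure.
function resolve_final_url(string $shortUrl): ?string
{
    $shortUrl = trim($shortUrl); // drop a stray "\r" or spaces left over from form input
    if ($shortUrl === '') {
        return null;
    }

    $ch = curl_init($shortUrl);
    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true, // do not echo the response
        CURLOPT_NOBODY         => true, // assumption: headers alone are enough
        CURLOPT_FOLLOWLOCATION => true, // follow redirects
        CURLOPT_MAXREDIRS      => 10,   // stop after 10 redirects
        CURLOPT_CONNECTTIMEOUT => 10,   // seconds allowed for connecting
        CURLOPT_TIMEOUT        => 10,   // seconds allowed for the whole transfer
    ));

    if (curl_exec($ch) === false) {
        return null; // network error, timeout or too many redirects
    }

    // The effective URL is the address of the last request cURL made.
    return curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
}
```

Inside the foreach loop it could be used as $final_urls[] = resolve_final_url($short); which stores a plain string per URL instead of the whole info array.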
Here's my test.php, if it will help: http://codepad.org/zI2wAOWL