PHP: convert all links to absolute URLs
Question
I am writing a website crawler in PHP, and I already have code that can extract all links from a site. The problem is that sites use a mix of absolute and relative URLs. Examples (http replaced with hxxp because I can't post hyperlinks):
hxxp://site.com/
site.com
site.com/index.php
hxxp://site.com/hello/index.php
/hello/index.php
hxxp://site2.com/index.php
site2.com/index.php
I have no control over whether the links are absolute or relative, but I do need to follow them, so I need to convert all of them into absolute URLs. How do I do this in PHP?
Here's a start
// Your crawler was sent to this page.
$url = 'http://example.com/page';
// Example of a relative link found on the page above.
$relative = '/hello/index.php';
// Parse the URL the crawler was sent to (kept separate from the original string).
$parts = parse_url($url);
if (false === filter_var($relative, FILTER_VALIDATE_URL)) {
    // If the link isn't a valid URL, assume it's relative and
    // construct an absolute URL from the page's scheme and host.
    // Note: this treats every relative link as relative to the site root.
    print $parts['scheme'] . '://' . $parts['host'] . '/' . ltrim($relative, '/');
}
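The snippet above treats every link that fails validation as relative to the site root, so a link such as other.php found on http://example.com/dir/page.html would lose its directory. A rough sketch of a fuller resolver follows. The function name to_absolute_url is hypothetical, the base URL is assumed to be a well-formed absolute URL, and a scheme-less link like site.com/index.php is treated as a relative path because there is no reliable way to tell it apart from one.

```php
<?php
// Hypothetical helper: resolve $link against $base (the page it was found on).
// Assumes $base is a well-formed absolute URL that parse_url() can handle.
function to_absolute_url(string $base, string $link): string
{
    $link = trim($link);
    // An empty link refers to the page itself.
    if ($link === '') {
        return $base;
    }
    // Anything with a scheme (http:, https:, mailto:, ...) is returned as-is.
    if (preg_match('#^[a-z][a-z0-9+.-]*:#i', $link)) {
        return $link;
    }

    $parts = parse_url($base);

    // Protocol-relative link (//host/path): reuse the scheme of the base.
    if (strpos($link, '//') === 0) {
        return $parts['scheme'] . ':' . $link;
    }

    $origin = $parts['scheme'] . '://' . $parts['host']
            . (isset($parts['port']) ? ':' . $parts['port'] : '');
    $basePath = $parts['path'] ?? '/';

    // Split the link into its path and its "?query#fragment" suffix.
    $cut      = strcspn($link, '?#');
    $linkPath = substr($link, 0, $cut);
    $suffix   = (string) substr($link, $cut);

    if ($linkPath === '') {
        // Query-only or fragment-only link: keep the path of the base.
        $path = $basePath;
    } elseif ($linkPath[0] === '/') {
        // Root-relative link.
        $path = $linkPath;
    } else {
        // Directory-relative link: replace the last segment of the base path.
        $path = substr($basePath, 0, strrpos($basePath, '/') + 1) . $linkPath;
    }

    // Collapse "." and ".." segments without climbing above the root.
    $segments = explode('/', $path);
    $last     = end($segments);
    $out      = [];
    foreach ($segments as $segment) {
        if ($segment === '..') {
            if (count($out) > 1) {
                array_pop($out);
            }
        } elseif ($segment !== '.') {
            $out[] = $segment;
        }
    }
    // A trailing "." or ".." refers to a directory, so keep the final slash.
    if ($last === '.' || $last === '..') {
        $out[] = '';
    }

    return $origin . implode('/', $out) . $suffix;
}
```

For example, to_absolute_url('http://example.com/dir/page.html', 'other.php') gives http://example.com/dir/other.php, while an already absolute link is passed through unchanged.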
Have a look at the http_build_url function (provided by the pecl_http extension rather than core PHP) as another way of building an absolute URL.
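Since http_build_url comes from the pecl_http extension and may not be installed, one possible substitute is to reassemble the array that parse_url returns by hand. The sketch below is only an illustration; build_url_from_parts is a hypothetical name, not part of PHP or any extension.

```php
<?php
// Hypothetical helper: rebuild a URL string from the array parse_url() returns.
function build_url_from_parts(array $p): string
{
    $url = '';
    if (isset($p['scheme'])) {
        $url .= $p['scheme'] . '://';
    }
    if (isset($p['user'])) {
        $url .= $p['user'];
        if (isset($p['pass'])) {
            $url .= ':' . $p['pass'];
        }
        $url .= '@';
    }
    $url .= $p['host'] ?? '';
    if (isset($p['port'])) {
        $url .= ':' . $p['port'];
    }
    $url .= $p['path'] ?? '';
    if (isset($p['query'])) {
        $url .= '?' . $p['query'];
    }
    if (isset($p['fragment'])) {
        $url .= '#' . $p['fragment'];
    }
    return $url;
}

// Swap the path of the crawled page for a root-relative link.
$parts = parse_url('http://example.com:8080/page?x=1');
$parts['path'] = '/hello/index.php';
unset($parts['query'], $parts['fragment']);
echo build_url_from_parts($parts), "\n"; // prints http://example.com:8080/hello/index.php
```

Working on the parsed parts keeps the port and any credentials of the base URL, which plain string concatenation of scheme and host would drop.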