Parsing complex URLs
Problem description
I'm trying to parse a list of URL strings, but after two hours of work I still have no result. The list of URL strings looks like this:
$url_list = array(
    'http://google.com',
    'http://localhost:8080/test/project/',
    'http://mail.yahoo.com',
    'http://www.bing.com',
    'http://www.phpromania.net/forum/viewtopic.php?f=24&t=7549',
    'https://prodgame10.alliances.commandandconquer.com/12/index.aspx',
    'https://prodgame10.alliances.commandandconquer.ro/12/index.aspx',
);
Output should be:
Array
(
[0] => .google.com
[1] => .localhost
[2] => .yahoo.com
[3] => .bing.com
[4] => .phpromania.net
[5] => .commandandconquer.com
)
The first thing that trips me up is a URL with more than two dots in it. Is there an example algorithm for this?
This is what I tried:
$url_list = array(
    'http://google.com',
    'http://localhost:8080/test/project/',
    'http://mail.yahoo.com',
    'http://www.bing.com',
    'http://www.phpromania.net/forum/viewtopic.php?f=24&t=27549',
    'https://prodgame10.alliances.commandandconquer.com/12/index.aspx',
);

function size($list)
{
    $i=0;
    while($list[++$i]!=NULL);
    return $i;
}

function url_Host($list)
{
    $listSize = size($list)-1;
    do
    {
        $strSize = size($list[$listSize]);
        $points = 0;
        $dpoints = 0;
        $tmpString = '';
        do
        {
            $currentChar = $list[$listSize][$strSize];
            if(ord('.')==ord($currentChar))
            {
                $tmpString .= '.';
                $points++;
            }
            else if(ord(':')==ord($currentChar))
            {
                $tmpString .= ':';
                $dpoints++;
            }
        }while($list[$listSize][--$strSize]!=NULL);
        print $tmpString;
        $strSize = size($list[$listSize]);
        $tmpString = '';
        do
        {
            $slice = false;
            $currentChar = $list[$listSize][$strSize];
            if($dpoints > 2)
            {
                if(ord('\\')==ord($curentChar)) $slice = true;
                $tmpString .= '';
            }
        }while($list[$listSize][--$strSize]!=NULL);
        print $tmpString."<br />";
    }while($list[--$listSize]);
}

url_Host($url_list);
Solution
You can use the built-in function parse_url() as follows:
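function getDomain($url)
{
    // parse_url() returns null when the URL has no host and false when it is seriously malformed.
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return '';
    }
    // Keep only the last two dot-separated labels, e.g. "mail.yahoo.com" -> "yahoo.com".
    return implode('.', array_slice(explode('.', $host), -2));
}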
Test cases:
$result = array();
foreach ($url_list as $url) {
    $result[] = getDomain($url);
}
print_r($result);
Output:
Array
(
[0] => google.com
[1] => localhost
[2] => yahoo.com
[3] => bing.com
[4] => phpromania.net
[5] => commandandconquer.com
[6] => commandandconquer.ro
)
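One caveat: keeping the last two labels assumes the registrable domain is always two labels long. That does not hold for hosts under multi-label suffixes such as co.uk, where www.example.co.uk would come out as co.uk. Below is a minimal sketch of one way to handle this, assuming a small hand-written suffix list. The function name getRegistrableDomain and the $multiLabelSuffixes parameter are hypothetical, and the three suffixes are only illustrative; a production version would likely consult the full Public Suffix List instead.

```php
<?php
// Hypothetical helper (not from the original answer): like getDomain(), but keeps
// three labels when the host ends in a known multi-label suffix.
// The default suffix list is illustrative only and far from complete.
function getRegistrableDomain($url, $multiLabelSuffixes = array('co.uk', 'com.au', 'co.jp'))
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return '';
    }

    // Host names are case-insensitive, so normalise before comparing.
    $labels = explode('.', strtolower($host));

    // Default: keep the last two labels, e.g. "mail.yahoo.com" -> "yahoo.com".
    $keep = 2;

    // If the last two labels form a known suffix, keep one more label.
    $lastTwo = implode('.', array_slice($labels, -2));
    if (count($labels) > 2 && in_array($lastTwo, $multiLabelSuffixes, true)) {
        $keep = 3;
    }

    return implode('.', array_slice($labels, -$keep));
}

echo getRegistrableDomain('http://www.example.co.uk/page'), "\n";       // example.co.uk
echo getRegistrableDomain('http://mail.yahoo.com'), "\n";               // yahoo.com
echo getRegistrableDomain('http://localhost:8080/test/project/'), "\n"; // localhost
```

For the URLs in the question, this returns the same values as getDomain(), since none of them uses a multi-label suffix.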
As for the leading dots, you can prepend them to each string manually, like so:
$result[] = "." . getDomain($url);
I'm not sure why you need to do this, but this should work.
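Putting the pieces together, here is a minimal self-contained sketch that produces the dot-prefixed list from the question. It repeats getDomain() so that it runs on its own. The helper name getDottedDomains is hypothetical and is introduced only for illustration.

```php
<?php
// Same approach as getDomain() in the answer: take the host, keep the last two labels.
function getDomain($url)
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return '';
    }
    return implode('.', array_slice(explode('.', $host), -2));
}

// Hypothetical helper: maps every URL to its domain with a leading dot.
// Note that a URL without a host yields just ".".
function getDottedDomains(array $urls)
{
    return array_map(function ($url) {
        return '.' . getDomain($url);
    }, $urls);
}

$url_list = array(
    'http://google.com',
    'http://localhost:8080/test/project/',
    'http://mail.yahoo.com',
    'http://www.bing.com',
    'http://www.phpromania.net/forum/viewtopic.php?f=24&t=7549',
    'https://prodgame10.alliances.commandandconquer.com/12/index.aspx',
    'https://prodgame10.alliances.commandandconquer.ro/12/index.aspx',
);

print_r(getDottedDomains($url_list));
```

Using array_map() keeps the input order and indexes, so entry [1] is .localhost and the last entry is .commandandconquer.ro.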