How to get all urls from page (php)
Problem description
I have a page with URLs and descriptions listed one under another (something like bookmarks / a list of sites). How do I use PHP to get all URLs from that page and write them to a txt file (one per line, only the URL without the description)?
The page looks like this:
And I would like the script's txt output to look like this:
Solution
One way to do it:
$url = "http://www.somewhere.com";
$data = file_get_contents($url);

// Keep only the <a> tags, then split the page on each closing </a>.
$data = strip_tags($data, "<a>");
$d = preg_split("/<\/a>/", $data);

foreach ($d as $k => $u) {
    if (strpos($u, "<a href=") !== FALSE) {
        // Strip everything up to and including the opening href=" ...
        $u = preg_replace("/.*<a\s+href=\"/sm", "", $u);
        // ... then everything from the closing quote onward.
        $u = preg_replace("/\".*/", "", $u);
        print $u . "\n";
    }
}
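The question also asks to write the URLs to a txt file (one per line), which the snippet above doesn't do, and regex-based extraction is fragile on real-world markup. A minimal sketch of an alternative using PHP's built-in DOMDocument; the sample HTML fragment and the output filename `urls.txt` are placeholders, not from the original question:

```php
<?php
// Extract all href values from an HTML string using DOMDocument,
// which handles messy real-world markup better than regexes.
function extract_urls(string $html): array
{
    $doc = new DOMDocument();
    // Suppress warnings that malformed HTML would otherwise trigger.
    @$doc->loadHTML($html);

    $urls = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = $anchor->getAttribute('href');
        if ($href !== '') {
            $urls[] = $href;
        }
    }
    return $urls;
}

// Placeholder input: a fragment shaped like the bookmark page described
// above. For a live page, fetch it first with file_get_contents($url).
$html = '<a href="http://example.com/one">First site</a><br>'
      . '<a href="http://example.com/two">Second site</a>';

$urls = extract_urls($html);

// Write one URL per line, with no descriptions.
file_put_contents('urls.txt', implode("\n", $urls) . "\n");
```

Because the anchor text is never collected, only the href values end up in the file, which matches the requested output format.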