Getting title and meta tags from external website
Question
I want to figure out how to get the following from an external page:
<title>A common title</title>
<meta name="keywords" content="Keywords blabla" />
<meta name="description" content="This is the description" />
The tags may appear in any order. I've heard of the PHP Simple HTML DOM Parser, but I don't really want to use it. Is there a solution that doesn't rely on the PHP Simple HTML DOM Parser?
Would preg_match fail if the HTML is invalid?
Can cURL do something like this together with preg_match?
Facebook does something like this, but it relies on pages using tags such as:
<meta property="og:description" content="Description blabla" />
I want something like this so that when someone posts a link, it retrieves the title and the meta tags. If there are no meta tags, they are ignored, or the user can set them manually (but I'll do that part myself later).
Recommended answer
This is how it should be done:
// Fetch a URL with cURL and return the response body as a string.
function file_get_contents_curl($url)
{
    $ch = curl_init();

    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);

    $data = curl_exec($ch);
    curl_close($ch);

    return $data;
}

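$html = file_get_contents_curl("http://example.com/");

// Parsing begins here. DOMDocument tolerates invalid HTML;
// libxml_use_internal_errors() keeps its warnings quiet.
$title = $description = $keywords = '';
if (is_string($html) && $html !== '') {
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    // Get the title, if the page has one.
    $nodes = $doc->getElementsByTagName('title');
    if ($nodes->length > 0) {
        $title = $nodes->item(0)->nodeValue;
    }

    // Scan the meta tags, in whatever order they appear.
    $metas = $doc->getElementsByTagName('meta');
    for ($i = 0; $i < $metas->length; $i++) {
        $meta = $metas->item($i);
        if ($meta->getAttribute('name') == 'description') {
            $description = $meta->getAttribute('content');
        }
        if ($meta->getAttribute('name') == 'keywords') {
            $keywords = $meta->getAttribute('content');
        }
    }
}

echo "Title: $title" . '<br/><br/>';
echo "Description: $description" . '<br/><br/>';
echo "Keywords: $keywords";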
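
The question also mentions Facebook-style tags such as og:description, which use a property attribute rather than name. As a minimal sketch of how the same DOMDocument approach might cover both, the helper below collects every meta tag into an array keyed by its name or property. The function name extract_page_meta and the shape of the returned array are assumptions made for illustration, not part of the original answer, and the sample HTML simply reuses the tags from the question.

```php
<?php
// Hypothetical helper (not from the original answer): returns the page title
// and all meta tags that carry a "name" or "property" attribute.
function extract_page_meta(string $html): array
{
    $result = ['title' => null, 'meta' => []];
    if (trim($html) === '') {
        return $result; // nothing to parse
    }

    // Collect parser warnings internally so malformed markup stays quiet.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $titles = $doc->getElementsByTagName('title');
    if ($titles->length > 0) {
        $result['title'] = trim($titles->item(0)->textContent);
    }

    foreach ($doc->getElementsByTagName('meta') as $meta) {
        // Standard tags use "name"; Open Graph tags use "property".
        $key = $meta->getAttribute('name') ?: $meta->getAttribute('property');
        if ($key !== '' && $meta->hasAttribute('content')) {
            $result['meta'][strtolower($key)] = $meta->getAttribute('content');
        }
    }

    return $result;
}

// Sample input built from the tags shown in the question.
$sample = '<html><head><title>A common title</title>'
    . '<meta name="keywords" content="Keywords blabla" />'
    . '<meta property="og:description" content="Description blabla" />'
    . '</head><body></body></html>';

$info = extract_page_meta($sample);
echo $info['title'], "\n";
echo $info['meta']['og:description'] ?? '(none)', "\n";
```

Missing tags simply do not appear in the array, so the caller can fall back to a user-supplied value with the ?? operator, which matches the "ignore it or let the user set it" behaviour the question asks for.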