Getting title and meta tags from external website

Question

I'm trying to figure out how to get the following from an external page:

<title>A common title</title>
<meta name="keywords" content="Keywords blabla" />
<meta name="description" content="This is the description" />

The tags may appear in any order. I've heard of the PHP Simple HTML DOM Parser, but I'd rather not use it. Is there a solution that doesn't rely on the PHP Simple HTML DOM Parser?

Will preg_match fail if the HTML is invalid?

Can cURL do something like this together with preg_match?

Facebook does something like this, but it relies on pages declaring their metadata explicitly with tags such as:

<meta property="og:description" content="Description blabla" />

I want something like this so that when someone posts a link, the title and the meta tags are retrieved. If there are no meta tags, they are ignored, or the user can set them manually (but I'll handle that later myself).

Recommended answer

This is the way it should be done:

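// Fetch a URL with cURL; returns the response body, or false on failure
function file_get_contents_curl($url)
{
    $ch = curl_init();

    curl_setopt($ch, CURLOPT_HEADER, 0);          // body only, no response headers
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);  // return the body instead of printing it
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);  // follow HTTP redirects

    $data = curl_exec($ch);
    curl_close($ch);

    return $data;
}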

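$html = file_get_contents_curl("http://example.com/");

// curl_exec() returns false on failure, and loadHTML() rejects an empty string
if ($html === false || $html === '') {
    die('Could not fetch the page.');
}

// parsing begins here; @ suppresses the warnings that invalid HTML triggers
$doc = new DOMDocument();
@$doc->loadHTML($html);
$nodes = $doc->getElementsByTagName('title');

// get what you need, falling back to '' when a tag is missing
$title = $nodes->length > 0 ? $nodes->item(0)->nodeValue : '';
$description = '';
$keywords = '';

$metas = $doc->getElementsByTagName('meta');

for ($i = 0; $i < $metas->length; $i++)
{
    $meta = $metas->item($i);
    if ($meta->getAttribute('name') == 'description')
        $description = $meta->getAttribute('content');
    if ($meta->getAttribute('name') == 'keywords')
        $keywords = $meta->getAttribute('content');
}

// escape the values, since they come from an untrusted external page
echo 'Title: ' . htmlspecialchars($title) . '<br/><br/>';
echo 'Description: ' . htmlspecialchars($description) . '<br/><br/>';
echo 'Keywords: ' . htmlspecialchars($keywords);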
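
The loop above only matches meta tags by their name attribute, while the Open Graph tag mentioned in the question uses property instead. As a rough sketch of how both could be collected in one pass, the snippet below parses a hard-coded sample string rather than a fetched page. The helper name extract_page_meta and the returned array shape are assumptions made for illustration, not part of the original answer.

```php
<?php
// Hypothetical helper (not from the original answer): collects the title and
// every <meta> tag that carries a name or property attribute.
function extract_page_meta(string $html): array
{
    $result = ['title' => null, 'meta' => []];
    if (trim($html) === '') {
        return $result; // nothing to parse; loadHTML() rejects empty input
    }

    // Collect libxml warnings internally so invalid markup does not emit notices
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $titles = $doc->getElementsByTagName('title');
    if ($titles->length > 0) {
        $result['title'] = trim($titles->item(0)->nodeValue);
    }

    foreach ($doc->getElementsByTagName('meta') as $meta) {
        // Standard tags use name="..."; Open Graph tags use property="..."
        $key = $meta->getAttribute('name') ?: $meta->getAttribute('property');
        if ($key !== '') {
            // Lowercase the key so "Keywords" and "keywords" are treated alike
            $result['meta'][strtolower($key)] = $meta->getAttribute('content');
        }
    }

    return $result;
}

// Sample markup: tags in arbitrary order, plus an unclosed <p> element
$sample = '<html><head>'
        . '<meta property="og:description" content="Description blabla" />'
        . '<meta name="Keywords" content="Keywords blabla" />'
        . '<title>A common title</title>'
        . '</head><body><p>unclosed</body></html>';

$info = extract_page_meta($sample);

echo $info['title'], "\n";
echo $info['meta']['keywords'] ?? '(none)', "\n";
echo $info['meta']['og:description'] ?? '(none)', "\n";
echo $info['meta']['description'] ?? '(none)', "\n"; // absent in the sample
```

Missing tags simply never appear in the array, so the caller can use the ?? operator to ignore them or substitute a value supplied by the user, which is the fallback the question asks about. To use this with a real page, the string returned by file_get_contents_curl() could be passed in place of $sample after checking that it is not false.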
