How to get the Open Graph Protocol tags of a webpage with PHP?
PHP has a simple function, get_meta_tags, for reading the meta tags of a webpage, but it only works for meta tags that have a name attribute. The Open Graph Protocol is becoming more and more popular these days, and its tags use a property attribute instead. What is the easiest way to get the Open Graph (og:) values from a webpage? For example:
<meta property="og:url" content="">
<meta property="og:title" content="">
<meta property="og:description" content="">
<meta property="og:type" content="">
The basic way I see is to fetch the page via cURL and parse it with a regex. Any ideas?
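The fetching half of that approach could look roughly like the sketch below. The helper name fetch_html, the timeout, and the redirect limit are assumptions made for illustration, not part of the original question, and the cURL extension must be enabled.

```php
<?php
/**
 * Hypothetical helper: download a page body with cURL.
 * Returns the body as a string, or null if the transfer failed.
 */
function fetch_html($url)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow HTTP redirects
        CURLOPT_MAXREDIRS      => 5,    // assumed limit, adjust as needed
        CURLOPT_TIMEOUT        => 10,   // assumed timeout in seconds
    ));
    $body = curl_exec($ch);

    return $body === false ? null : $body;
}

// Example call (needs network access, so it is left commented out):
// $html = fetch_html('https://example.com/');
```

The returned string is what the parsing code further down refers to as $html.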
When parsing data from HTML, you really shouldn't use regex. Take a look at the DOMXPath Query function.
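To illustrate why, here is a small sketch comparing a naive regex with a DOM query on two equivalent tags. The sample markup and the pattern are made up for this example; the point is only that the pattern depends on attribute order and quoting, while the parser does not.

```php
<?php
// Two valid meta tags; the second swaps the attribute order and uses single quotes.
$html = '<html><head>'
      . '<meta property="og:title" content="First">'
      . "<meta content='Second' property='og:type'>"
      . '</head><body></body></html>';

// Naive pattern: assumes property comes first and both values are double-quoted.
preg_match_all('/<meta property="(og:[^"]+)" content="([^"]*)"/', $html, $m);
$byRegex = array_combine($m[1], $m[2]);

// DOM approach: the parser normalizes attribute order and quoting.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$byDom = array();
foreach ($xpath->query('//meta[starts-with(@property, "og:")]') as $meta) {
    $byDom[$meta->getAttribute('property')] = $meta->getAttribute('content');
}

var_dump($byRegex); // contains only og:title
var_dump($byDom);   // contains og:title and og:type
```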
Now, the actual code could be:
[EDIT] A better XPath query was given by Stefan Gehrig, so the code can be shortened to:
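// Collect libxml parse warnings internally instead of emitting them
// (an alternative to suppressing them with @).
libxml_use_internal_errors(true);

// $html is assumed to already hold the page source, e.g. fetched via cURL.
$doc = new DOMDocument();
$doc->loadHTML($html);

// Select only the meta elements whose property attribute starts with "og:".
$xpath = new DOMXPath($doc);
$query = '//*/meta[starts-with(@property, \'og:\')]';
$metas = $xpath->query($query);

// Build a property => content map.
$rmetas = array();
foreach ($metas as $meta) {
    $property = $meta->getAttribute('property');
    $content = $meta->getAttribute('content');
    $rmetas[$property] = $content;
}
var_dump($rmetas);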
Instead of:
$doc = new DOMDocument();
@$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
$query = '//*/meta';
$metas = $xpath->query($query);
$rmetas = array();
foreach ($metas as $meta) {
    $property = $meta->getAttribute('property');
    $content = $meta->getAttribute('content');
    if (!empty($property) && preg_match('#^og:#', $property)) {
        $rmetas[$property] = $content;
    }
}
var_dump($rmetas);
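For reuse, the same DOMXPath technique could be wrapped in a function and tried on an inline HTML string. This is only a sketch: the function name extract_open_graph and the sample markup are assumptions introduced here, not part of the original answer.

```php
<?php
/**
 * Hypothetical helper: return every og:* meta tag found in an HTML string
 * as a property => content array.
 */
function extract_open_graph($html)
{
    // DOMDocument::loadHTML rejects empty input, so handle that case up front.
    if (trim($html) === '') {
        return array();
    }

    // Silence libxml warnings for this call only, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $tags = array();
    foreach ($xpath->query('//meta[starts-with(@property, "og:")]') as $meta) {
        $tags[$meta->getAttribute('property')] = $meta->getAttribute('content');
    }

    return $tags;
}

// Made-up sample page: one name-based tag and two Open Graph tags.
$sample = '<html><head><title>Demo</title>'
        . '<meta name="description" content="ignored">'
        . '<meta property="og:title" content="Demo page">'
        . '<meta property="og:type" content="website">'
        . '</head><body></body></html>';

print_r(extract_open_graph($sample));
```

Note that a property appearing more than once (several og:image tags, for instance) keeps only its last value in this sketch; collecting the values into nested arrays would be needed to preserve all of them.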