PHP: get img src from XML

Question

I have a page with XML that looks like this:

<?xml version="1.0" encoding="UTF-8"?><rss version="2.0">
  <channel>
    <title>FB-RSS feed for Salman Khan  Fc</title>
    <link>http://facebook.com/profile.php?id=1636293749919827/</link>
    <description>FB-RSS feed for Salman Khan  Fc</description>
    <managingEditor>http://fbrss.com (FB-RSS)</managingEditor>
    <pubDate>31 Mar 16 20:00 +0000</pubDate>
    <item>
      <title>Photo - Who is the Best Khan ?</title>
      <link>https://www.facebook.com/SalmanKhanFns/photos/a.1639997232882812.1073741827.1636293749919827/1713146978901170/?type=3</link>
      <description>&lt;a href=&#34;https://www.facebook.com/SalmanKhanFns/photos/a.1639997232882812.1073741827.1636293749919827/1713146978901170/?type=3&#34;&gt;&lt;img src=&#34;https://scontent.xx.fbcdn.net/hphotos-xap1/v/t1.0-0/s130x130/11059765_1713146978901170_8711054263905505442_n.jpg?oh=fa2978c5ecfb3ae424e9082aaa057b8f&amp;oe=57BB41D5&#34;&gt;&lt;/a&gt;&lt;br&gt;&lt;br&gt;Who is the Best Khan ?</description>
      <author>FB-RSS</author>
      <guid>1636293749919827_1713146978901170</guid>
      <pubDate>31 Mar 16 20:00 +0000</pubDate>
    </item>
    <item>
      <title>Photo</title>
      <link>https://www.facebook.com/SalmanKhanFns/photos/a.1636293813253154.1073741825.1636293749919827/1713146755567859/?type=3</link>
      <description>&lt;a href=&#34;https://www.facebook.com/SalmanKhanFns/photos/a.1636293813253154.1073741825.1636293749919827/1713146755567859/?type=3&#34;&gt;&lt;img src=&#34;https://scontent.xx.fbcdn.net/hphotos-xap1/v/t1.0-0/s130x130/12294686_1713146755567859_6728330714340999478_n.jpg?oh=6d90a688fdf4342f9e12e9ff9a66b127&amp;oe=57778068&#34;&gt;&lt;/a&gt;&lt;br&gt;&lt;br&gt;</description>
      <author>FB-RSS</author>
      <guid>1636293749919827_1713146755567859</guid>
      <pubDate>31 Mar 16 19:58 +0000</pubDate>
    </item>
  </channel>
</rss>

I want to get the src values of the img tags in the XML above.

The images are stored in the <description> elements. However, they do not appear as literal tags such as

<img ...

Instead, they look like:

&lt;img src=&#34;https://scontent.xx.fbc ...

The < is replaced with &lt; and so on. I guess that's why $imgs = $dom->getElementsByTagName('img'); returns nothing.

Is there a workaround?

This is what I have so far:

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadXML($xml_file);

// Returns an empty list: the <img> tags are escaped text, not XML elements
$imgs = $dom->getElementsByTagName('img');

// Then run a possible foreach, something like:
foreach ($imgs as $img) {
    // the src of the $img
    $src = $img->getAttribute('src');

    // try it out
    echo '<img src="' . $src . '" /> <br />';
}

Any ideas?

Recommended answer

You have HTML embedded in XML tags, so you have to retrieve the XML nodes, load each node's content as HTML, and then read the desired attribute from the tag.
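
To see why getElementsByTagName('img') finds nothing, it can help to inspect a single <description> directly. The sketch below uses a shortened, hypothetical feed (the example.com URL and the variable names are illustrative assumptions, not the original data). It shows that the XML parser treats the escaped markup as plain text, and that nodeValue returns that text with the entities already decoded.

```php
<?php
// Shortened, hypothetical feed: the image URL is a placeholder.
$xml = <<<'XML'
<rss version="2.0">
  <channel>
    <item>
      <description>&lt;img src=&#34;https://example.com/a.jpg&#34;&gt;&lt;br&gt;Caption</description>
    </item>
  </channel>
</rss>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

// The escaped <img> is not an element of the XML tree.
$imgCount = $dom->getElementsByTagName('img')->length;

// The <description> holds text only; nodeValue returns it entity-decoded.
$description = $dom->getElementsByTagName('description')->item(0);
$decoded = $description->nodeValue;

echo $imgCount, PHP_EOL;  // 0
echo $decoded, PHP_EOL;   // <img src="https://example.com/a.jpg"><br>Caption
```

The decoded string is ordinary HTML, which is why it can be handed to a second parser.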

Your XML contains <description> nodes at more than one level (one under <channel> and one in each <item>), so ->getElementsByTagName will return more nodes than you want. Use DOMXPath to retrieve only the <description> nodes in the right tree position:

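$dom = new DOMDocument();
libxml_use_internal_errors( true );   // collect parser warnings instead of printing them
$dom->loadXML( $xml );
$dom->formatOutput = true;

// With no context node, the expression is evaluated relative to the root element <rss>
$xpath = new DOMXPath( $dom );
$nodes = $xpath->query( 'channel/item/description' );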
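
A quick way to check the difference is to count what each approach matches. This is a minimal sketch on a cut-down, hypothetical feed (the text content is placeholder data). It relies on the behaviour described in the PHP manual: when no context node is given, DOMXPath::query() evaluates a relative expression against the root element.

```php
<?php
// Cut-down, hypothetical feed with one channel-level and two item-level descriptions.
$xml = <<<'XML'
<rss version="2.0">
  <channel>
    <description>Feed description</description>
    <item><description>First item</description></item>
    <item><description>Second item</description></item>
  </channel>
</rss>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

// Matches every <description>, including the channel-level one.
$all = $dom->getElementsByTagName('description')->length;

// Matches only <description> elements that are direct children of an <item>.
$xpath = new DOMXPath($dom);
$perItem = $xpath->query('channel/item/description')->length;

echo $all, ' vs ', $perItem, PHP_EOL;  // 3 vs 2
```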

Then iterate over the nodes, load each node's value into a new DOMDocument (there is no need to decode the HTML entities, since DOM has already decoded them for you), and extract the src attribute from the <img> node:

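foreach( $nodes as $node )
{
    // nodeValue is already entity-decoded, so it holds plain HTML
    $html = new DOMDocument();
    $html->loadHTML( $node->nodeValue );
    $img = $html->getElementsByTagName( 'img' )->item(0);
    if( $img === null ) continue;   // no <img> in this description
    $src = $img->getAttribute( 'src' );
    echo $src, PHP_EOL;
}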

eval.in demo
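
Putting the steps together, one possible self-contained version is sketched below. The function name extractImageSources and the sample feed are hypothetical and exist only for illustration. The sketch uses an absolute XPath, which assumes the root element is <rss>, and it skips descriptions that are empty or contain no <img>.

```php
<?php
/**
 * Returns the src of the first <img> found in each item description.
 * Hypothetical helper, not part of any library.
 */
function extractImageSources(string $xml): array
{
    // Collect parser warnings internally and remember the previous setting.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $sources = [];

    if ($dom->loadXML($xml)) {
        $xpath = new DOMXPath($dom);
        foreach ($xpath->query('/rss/channel/item/description') as $node) {
            $markup = trim((string) $node->nodeValue);
            if ($markup === '') {
                continue; // nothing to parse
            }

            // The node value is already entity-decoded HTML.
            $html = new DOMDocument();
            $html->loadHTML($markup);

            $img = $html->getElementsByTagName('img')->item(0);
            if ($img !== null && $img->hasAttribute('src')) {
                $sources[] = $img->getAttribute('src');
            }
        }
    }

    // Restore the libxml error state for the rest of the script.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $sources;
}

// Hypothetical sample feed: one item with an image, one without.
$xml = <<<'XML'
<rss version="2.0">
  <channel>
    <description>Feed description</description>
    <item><description>&lt;a href=&#34;https://example.com/p/1&#34;&gt;&lt;img src=&#34;https://example.com/1.jpg&#34;&gt;&lt;/a&gt;&lt;br&gt;Caption</description></item>
    <item><description>No image here</description></item>
  </channel>
</rss>
XML;

$sources = extractImageSources($xml);
print_r($sources);
```

Returning an array rather than echoing inside the loop keeps the extraction separate from the output, so the same function can feed an HTML template or a download queue.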
