Retrieve elements with XPath and DOMDocument
Question
I have a list of ads in the HTML code below. What I need is a PHP loop to get the following elements for each ad:
- ad URL (href attribute of the <a> tag)
- ad image URL (src attribute of the <img> tag)
- ad title (HTML content of the <div class="title"> tag)
<div class="ads">
<a href="http://path/to/ad/1">
<div class="ad">
<div class="image">
<div class="wrapper">
<img src="http://path/to/ad/1/image.jpg">
</div>
</div>
<div class="detail">
<div class="title">Ad #1</div>
</div>
</div>
</a>
<a href="http://path/to/ad/2">
<div class="ad">
<div class="image">
<div class="wrapper">
<img src="http://path/to/ad/2/image.jpg">
</div>
</div>
<div class="detail">
<div class="title">Ad #2</div>
</div>
</div>
</a>
</div>
I managed to get the ad URL with the PHP code below.
$d = new DOMDocument();
$d->loadHTML($ads); // the variable $ads contains the HTML code above
$xpath = new DOMXPath($d);
$ls_ads = $xpath->query('//a');
foreach ($ls_ads as $ad) {
    $ad_url = $ad->getAttribute('href');
    print("AD URL : $ad_url");
}
But I couldn't get the other two elements (image URL and title). Any ideas?
Recommended answer
I managed to get what I need with this code (based on Khue Vu's code):
$d = new DOMDocument();
$d->loadHTML($ads); // the variable $ads contains the HTML code above
$xpath = new DOMXPath($d);
$ls_ads = $xpath->query('//a');
foreach ($ls_ads as $ad) {
    // get ad url
    $ad_url = $ad->getAttribute('href');
    // copy the current ad into its own DOMDocument so it can be queried in isolation
    $ad_Doc = new DOMDocument();
    // importNode() with deep = true already copies the whole subtree, so no separate cloneNode() is needed
    $ad_Doc->appendChild($ad_Doc->importNode($ad, true));
    // use a separate XPath object for this ad so the outer $xpath is not overwritten
    $ad_xpath = new DOMXPath($ad_Doc);
    // get ad title
    $ad_title_tag = $ad_xpath->query("//div[@class='title']");
    $ad_title = trim($ad_title_tag->item(0)->nodeValue);
    // get ad image
    $ad_image_tag = $ad_xpath->query("//img/@src");
    $ad_image = $ad_image_tag->item(0)->nodeValue;
}
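A possible alternative, not part of the original answer: DOMXPath::query() and DOMXPath::evaluate() accept a context node as their second argument, so each ad can be queried with a relative path (one starting with ".") without building a new DOMDocument per ad. The sketch below assumes the markup from the question; the $results array, the use of string() to get plain values, and the libxml_use_internal_errors() call are illustrative choices.

```php
<?php
// Sample markup condensed from the question; in practice $ads would hold the real page HTML.
$ads = <<<HTML
<div class="ads">
<a href="http://path/to/ad/1">
<div class="ad">
<div class="image"><div class="wrapper"><img src="http://path/to/ad/1/image.jpg"></div></div>
<div class="detail"><div class="title">Ad #1</div></div>
</div>
</a>
<a href="http://path/to/ad/2">
<div class="ad">
<div class="image"><div class="wrapper"><img src="http://path/to/ad/2/image.jpg"></div></div>
<div class="detail"><div class="title">Ad #2</div></div>
</div>
</a>
</div>
HTML;

// Collect HTML parser warnings internally instead of printing them (assumed preference).
libxml_use_internal_errors(true);
$d = new DOMDocument();
$d->loadHTML($ads);
libxml_clear_errors();

$xpath = new DOMXPath($d);
$results = []; // hypothetical container for the extracted values

foreach ($xpath->query('//a') as $ad) {
    $results[] = [
        'url'   => $ad->getAttribute('href'),
        // The leading "." restricts each search to descendants of the current <a>.
        // string() returns the value directly, or an empty string if nothing matches.
        'image' => $xpath->evaluate('string(.//img/@src)', $ad),
        'title' => trim($xpath->evaluate("string(.//div[@class='title'])", $ad)),
    ];
}

print_r($results);
```

Without the leading ".", a path such as //img/@src is evaluated against the whole document even when a context node is passed, which is why a first attempt at this often returns the first ad's image for every ad.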