Using a foreach loop to scrape all class data from page source
Question
I am scraping data from a web page using DOM. I can scrape the data for the first matching element. I added a foreach to cover every element with the review-wrapper class. I think it iterates, but every iteration prints the same result.
I am scraping the review text, the date, and the rating value.
Example: http://codepad.viper-7.com/lHS9jk
Code:
<?php
libxml_use_internal_errors(true);
$html = file_get_contents('http://www.yelp.com/biz/franchino-san-francisco?start=80');
$html = escapeshellarg($html);
$html = nl2br($html);
$classname = 'review-wrapper';
$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$results = $xpath->query("//*[@class='" . $classname . "']");
foreach ($results as $node)
{
    $classname = 'rating-qualifier';
    $dom = new DOMDocument;
    $dom->loadHTML($html);
    $xpath = new DOMXPath($dom);
    $results = $xpath->query("//*[@class='" . $classname . "']");
    if ($results->length > 0) {
        echo $review = $results->item(0)->nodeValue;
        echo "<br/>";
    }
    $classname = 'review_comment ieSucks';
    $dom = new DOMDocument;
    $dom->loadHTML($html);
    $xpath = new DOMXPath($dom);
    $results = $xpath->query("//*[@class='" . $classname . "']");
    if ($results->length > 0) {
        echo $review = $results->item(0)->nodeValue;
        echo "<br/>";
    }
    $meta = $dom->documentElement->getElementsByTagName("meta");
    echo $meta->item(0)->getAttribute('content');
    echo "<br/>";
}
?>
Recommended answer
You can do this by using a for loop:
<?php
libxml_use_internal_errors(true); // suppress warnings from malformed HTML

$html = file_get_contents('http://www.yelp.com/biz/franchino-san-francisco?start=80');

// Parse the page once and reuse the same DOMXPath object for every query.
$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$dates   = $xpath->query("//*[@class='rating-qualifier']");
$reviews = $xpath->query("//*[@class='review_comment ieSucks']");
$ratings = $xpath->query("//meta[@itemprop='ratingValue']");

// Stop at the shortest list so that item($x) is always a valid node.
$count = min($dates->length, $reviews->length, $ratings->length);
for ($x = 0; $x < $count; $x++) {
    echo $dates->item($x)->nodeValue;
    echo "<br/>";
    echo $reviews->item($x)->nodeValue;
    echo "<br/>";
    echo $ratings->item($x)->getAttribute('content');
    echo "<br/>";
}
?>
Demo here: http://codepad.viper-7.com/C6KRW2
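If you would rather keep the foreach from the question, one option is to scope each inner query to the current node. DOMXPath::query() accepts a context node as its second argument, and an expression that starts with ".//" then searches only inside that node instead of the whole document. The sketch below is only an illustration under some assumptions: the function name extractReviews and the sample markup are made up, and it assumes the date, review text, and rating meta tag all sit inside each review-wrapper element, which may not match the live page.

```php
<?php
// Hypothetical helper (not part of the original answer): returns one
// array per element whose class attribute is exactly "review-wrapper".
function extractReviews(string $html): array
{
    libxml_use_internal_errors(true); // tolerate malformed real-world HTML

    $dom = new DOMDocument;
    $dom->loadHTML($html);
    $xpath = new DOMXPath($dom);

    $reviews = [];
    foreach ($xpath->query("//*[@class='review-wrapper']") as $node) {
        // The leading "." plus the second argument limit each query to $node.
        $date   = $xpath->query(".//*[@class='rating-qualifier']", $node)->item(0);
        $text   = $xpath->query(".//*[@class='review_comment ieSucks']", $node)->item(0);
        $rating = $xpath->query(".//meta[@itemprop='ratingValue']", $node)->item(0);

        $reviews[] = [
            'date'   => $date ? trim($date->nodeValue) : null,
            'review' => $text ? trim($text->nodeValue) : null,
            'rating' => $rating instanceof DOMElement
                ? $rating->getAttribute('content')
                : null,
        ];
    }
    return $reviews;
}

// Made-up sample markup standing in for the fetched page.
$sample = <<<'HTML'
<html><body>
<div class="review-wrapper">
  <meta itemprop="ratingValue" content="5.0">
  <span class="rating-qualifier">1/2/2014</span>
  <p class="review_comment ieSucks">Great pasta.</p>
</div>
<div class="review-wrapper">
  <meta itemprop="ratingValue" content="3.0">
  <span class="rating-qualifier">3/4/2014</span>
  <p class="review_comment ieSucks">Slow service.</p>
</div>
</body></html>
HTML;

foreach (extractReviews($sample) as $r) {
    echo $r['date'], ' | ', $r['rating'], ' | ', $r['review'], "\n";
}
```

Because every value is looked up inside its own wrapper, a review that lacks one of the fields yields null for that field rather than shifting the values of the reviews after it.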