Find div with class using PHP Simple HTML DOM Parser


Question




I am just getting started with this parser and am already running into problems right at the beginning.

Referring to this tutorial:

http://net.tutsplus.com/tutorials/php/html-parsing-and-screen-scraping-with-the-simple-html-dom-library/

I simply want to find, in the page source, the content of a div with the class ClearBoth Box.

I retrieve the code with curl and create a simple html dom object:

$cl = curl_exec($curl);  
$html = new simple_html_dom();
$html->load($cl);

Then I wanted to add the content of the div into an array called divs:

$divs = $html->find('div[.ClearBoth Box]');

But when I print_r $divs, the output contains much more, even though the source code has nothing more inside the div.

Like this:

Array
(
    [0] => simple_html_dom_node Object
        (
            [nodetype] => 1
            [tag] => br
            [attr] => Array
                (
                    [class] => ClearBoth
                )

            [children] => Array
                (
                )

            [nodes] => Array
                (
                )

            [parent] => simple_html_dom_node Object
                (
                    [nodetype] => 1
                    [tag] => div
                    [attr] => Array
                        (
                            [class] => SocialMedia
                        )

                    [children] => Array
                        (
                            [0] => simple_html_dom_node Object
                                (
                                    [nodetype] => 1
                                    [tag] => iframe
                                    [attr] => Array
                                        (
                                            [id] => ShowFacebookButtons
                                            [class] => SocialWeb FloatLeft
                                            [src] => http://www.facebook.com/plugins/xxx
                                            [style] => border:none; overflow:hidden; width: 250px; height: 70px;
                                        )

                                    [children] => Array
                                        (
                                        )

                                    [nodes] => Array
                                        (
                                        )

I do not understand why $divs does not simply contain the code from the div.

Here is an example of the source code at the site:

<div class="ClearBoth Box">
          <div>
<i class="Icon SmallIcon ProductRatingEnabledIconSmall" title="gute peppige Qualität: Sehr empfehlenswert"></i>
<i class="Icon SmallIcon ProductRatingEnabledIconSmall" title="gute peppige Qualität: Sehr empfehlenswert"></i>
<i class="Icon SmallIcon ProductRatingEnabledIconSmall" title="gute peppige Qualität: Sehr empfehlenswert"></i>
<i class="Icon SmallIcon ProductRatingEnabledIconSmall" title="gute peppige Qualität: Sehr empfehlenswert"></i>
<i class="Icon SmallIcon ProductRatingEnabledIconSmall" title="gute peppige Qualität: Sehr empfehlenswert"></i>

              <strong class="AlignMiddle LeftSmallPadding">gute peppige Qualität</strong> <span class="AlignMiddle">(17.03.2013)</span>
          </div>
          <div class="BottomMargin">
            gute Verarbeitung, schönes Design,
          </div>
        </div>

What am I doing wrong?

Solution

The right code to get a div with class is:

$ret = $html->find('div.foo');
//OR
$ret = $html->find('div[class=foo]');

Basically, you can select elements as if you were using a CSS selector.

Source: http://simplehtmldom.sourceforge.net/manual.htm
See the "How to find HTML elements?" section, "Advanced" tab.
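
Applied to the markup in the question, this might look like the sketch below. The selector 'div[.ClearBoth Box]' is not valid CSS-style syntax, and the br element with class ClearBoth in the output suggests the parser fell back to matching on the class ClearBoth alone. The sketch matches the full class attribute value instead and reads only innertext from each result, because print_r on a node also dumps its parent, children and nodes properties. The file location simple_html_dom.php and the shortened $source string are assumptions for illustration; in the question the HTML comes from curl_exec.

<?php
// Assumption: simple_html_dom.php from the Simple HTML DOM library sits next to this script.
require_once 'simple_html_dom.php';

// Shortened copy of the markup from the question, standing in for the curl response.
$source = <<<HTML
<div class="ClearBoth Box">
  <div><strong class="AlignMiddle LeftSmallPadding">gute peppige Qualität</strong></div>
  <div class="BottomMargin">gute Verarbeitung, schönes Design,</div>
</div>
HTML;

$html = str_get_html($source);

// Attribute selector with the exact class value, quoted because it contains a space.
$boxes = $html->find('div[class="ClearBoth Box"]');

$contents = array();
foreach ($boxes as $box) {
    // innertext is the HTML inside the element; plaintext would strip the tags.
    $contents[] = trim($box->innertext);
}

// Print the extracted strings rather than the node objects themselves.
print_r($contents);

If the class attribute on the real page can contain the class names in a different order or with additional classes, 'div.ClearBoth' may be the more forgiving choice, at the cost of also matching other div elements that carry that class.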
