PHP Simple HTML DOM Parser

Question

I've just started using PHP Simple HTML DOM Parser.

Now I'm trying to extract all elements surrounded by a <b> tag, including the closing </b>, from an existing HTML document. This works fine with:

foreach($html->find('b') as $q)
    echo $q;

How can I show only the elements surrounded by <b></b> tags that are followed by a <span class="marked">?

Update: I've used Firebug to get the CSS path for the elements. Now it looks like this:

foreach ($html->find('html body div#wrapper table.desc tbody tr td div span.marked') as $x)
    foreach ($x->find('html body div#wrapper table.desc tbody tr td table.split tbody tr td b') as $d)
        echo $d;

But it doesn't work. Any ideas?

Update: To clarify my question, here is a sample row of the document, together with the opening and closing table tags.

<table width="100%" border="0" cellspacing="0" cellpadding="0" class="desc">
    <tr>
        <th width="25%" scope="col"><div align="center">1</div></th>
        <th width="50" scope="col"><div align="center">2</div></th>
        <th width="10%" scope="col"><div align="center">3</div></th>
        <th width="15%" scope="col"><div align="center">4</div></th>
    </tr>
    <tr>
        <td valign="top" bgcolor="#E9E9E9"><div style="text-align: center; font-weight: bold; margin-top: 2px"> 1 </div></td>
        <td>
            <table width="100%" border="0" cellspacing="0" cellpadding="0" class="split">  <tr>
                    <td>
                        <b> element to extract</b></td>
                </tr>
                <tr>
                    <td>
                        <table width="100%" border="0" cellspacing="0" cellpadding="0" class="split">  <tr>
                                <td width="15px" valign="top">&nbsp;</td>
                                <td width="15px" valign="top">  
                                    <div style="background-color:green ;color:#FFFFFF; text-align:center;padding-bottom: 1px">
                                        1
                                    </div>
                                </td>
                                <td>
                                    abed
                                </td>
                            </tr>
                            <tr>
                                <td width="15px" valign="top">&nbsp;</td>
                                <td width="15px" valign="top">  
                                    <div style="background-color:green ;color:#FFFFFF; text-align:center;padding-bottom: 1px">
                                        2
                                    </div>
                                </td>
                                <td>
                                    ddee
                                </td>
                            </tr>
                            <tr>
                                <td width="15px" valign="top">&nbsp;</td>
                                <td width="15px" valign="top">  
                                    <div style="background-color:green ;color:#FFFFFF; text-align:center;padding-bottom: 1px">
                                        3
                                    </div>
                                </td>
                                <td>
                                    xdef
                                </td>
                            </tr>
                            <tr>
                                <td width="15px" valign="top">&nbsp;</td>
                                <td width="15px" valign="top">
                                    <div style="background-color:green ;color:#FFFFFF; text-align:center;padding-bottom: 1px">
                                        4
                                    </div>
                                </td>
                                <td>
                                    abbcc
                                </td>
                            </tr>
                            <tr>
                                <td width="15px" valign="top">&nbsp;</td>
                                <td width="15px" valign="top">  
                                    <div style="background-color:green ;color:#FFFFFF; text-align:center;padding-bottom: 1px">
                                        5
                                    </div>
                                </td>
                                <td>
                                    ab
                                </td>
                            </tr>
                            <tr>
                                <td width="15px" valign="top">&nbsp;</td>
                                <td width="15px" valign="top">  
                                    <div style="background-color:green ;color:#FFFFFF; text-align:center;padding-bottom: 1px">
                                        6
                                    </div>
                                </td>
                                <td>
                                    e1
                                </td>
                            </tr>
                        </table>
                    </td>
                </tr>
            </table>
        </td>
        <td valign="top"><div style="text-align: center"> <span class="marked">marked</span> </div></td>
        <td valign="top"><div style="text-align: center">  </div></td>
    </tr>
</table>


Answer

Try the following CSS selector:

b > span.marked

That would return the span though, so you will probably have to call $e->parent() to get to the b element.

Also see "Best Methods to parse HTML" for alternatives to SimpleHtmlDom.

Edit after the update:

Your browser will modify the DOM. If you look at your markup, you will see that there are no tbody elements. Yet Firebug gives you

html body div#wrapper table.desc tbody tr td div span.marked
html body div#wrapper table.desc tbody tr td table.split tbody tr td b
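To see why a path copied from the browser can miss, here is a minimal sketch. It uses PHP's bundled DOM extension (DOMDocument and DOMXPath) rather than Simple HTML DOM, purely so that it runs without extra files, and the markup is a hypothetical, trimmed-down stand-in for the document in the question. The first query insists on a tbody step, so it should find nothing if the parser leaves the table as written; the second uses a descendant step and does not depend on tbody being present or absent.

```php
<?php
// Hypothetical, trimmed-down stand-in for the question's markup (no tbody).
$markup = <<<HTML
<html><body><div id="wrapper">
<table class="desc">
  <tr><td><div><span class="marked">marked</span></div></td></tr>
</table>
</div></body></html>
HTML;

$doc = new DOMDocument();
// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($markup);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Path in the style of a browser inspector: requires a tbody step.
$withTbody = $xpath->query('//table[@class="desc"]/tbody/tr/td/div/span[@class="marked"]');

// Descendant-based path: works whether or not a tbody exists.
$anyDepth = $xpath->query('//table[@class="desc"]//span[@class="marked"]');

printf("with tbody step: %d match(es)\n", $withTbody->length);
printf("descendant path: %d match(es)\n", $anyDepth->length);
```

The same idea carries over to CSS-style selectors: leaving out the steps the browser added (or using a plain descendant selector) avoids depending on elements that are not in the source.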

Also, your question does not match the queries. You asked how to find

<b>,</b>-tags followed by a <span class="marked">

which can be read to mean either

<b><span class="marked">foo</span></b>

or

<b><element>foo</element></b><span class="marked">foo</span>

For the first, use the child combinator I showed earlier. For the second, use the adjacent sibling combinator

b + span.marked

to get the span, and then use $e->prev_sibling() to get the element's previous sibling (or null if there is none).
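The two readings can be compared side by side. The sketch below is an assumption-laden illustration, not Simple HTML DOM code: it uses PHP's bundled DOMDocument and DOMXPath, the markup is made up for the example, and the XPath expressions are my equivalents of the two combinators, written so that they return the b element directly instead of the span. Note that @class="marked" only matches when the class attribute is exactly "marked".

```php
<?php
// Made-up markup containing one instance of each reading.
$markup = '<div>'
    . '<p><b><span class="marked">first reading</span></b></p>'
    . '<p><b>second reading</b><span class="marked">x</span></p>'
    . '</div>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($markup);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

// Counterpart of "b > span.marked" followed by parent():
// every b that has a span.marked as a direct child.
$parents = $xpath->query('//b[span[@class="marked"]]');

// Counterpart of "b + span.marked" followed by prev_sibling():
// every b whose next element sibling is a span.marked.
$siblings = $xpath->query(
    '//b[following-sibling::*[1][self::span and @class="marked"]]'
);

foreach ($parents as $b) {
    echo "child reading: ", $doc->saveHTML($b), "\n";
}
foreach ($siblings as $b) {
    echo "sibling reading: ", $doc->saveHTML($b), "\n";
}
```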

However, the markup you have shown contains neither. There is only a DIV with a SPAN child that has the marked class:

<div style="text-align: center"> <span class="marked">marked</span>

If that is what you want to match, it's the child combinator again. Of course, you then have to change the b to a div.
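If the actual goal is to print the b element of each row that carries a marked span (that is my assumption about the intent, not something the question states outright), one possible approach is to select the row first and then look for b below it. The sketch again uses PHP's bundled DOMDocument and DOMXPath instead of Simple HTML DOM, and the markup is a shortened, hypothetical version of the row from the question with a second, unmarked row added for contrast.

```php
<?php
// Shortened, hypothetical version of the table from the question.
$markup = <<<HTML
<table class="desc">
  <tr>
    <td><table class="split"><tr><td><b> element to extract</b></td></tr></table></td>
    <td><div style="text-align: center"> <span class="marked">marked</span> </div></td>
  </tr>
  <tr>
    <td><table class="split"><tr><td><b> unmarked element</b></td></tr></table></td>
    <td><div style="text-align: center"> </div></td>
  </tr>
</table>
HTML;

$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($markup);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

// Rows that have a cell matching "div > span.marked",
// then every b anywhere below such a row.
$query = '//tr[td/div/span[@class="marked"]]//b';

foreach ($xpath->query($query) as $b) {
    echo trim($b->textContent), "\n";
}
```

The row test is relative to each tr, so it behaves the same whether or not a tbody sits between the table and its rows.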
