PHP Simple HTML DOM Parser
Question
I've just started using PHP Simple HTML DOM Parser.
Now I'm trying to extract all elements surrounded by <b> ... </b> tags (tags included) from an existing HTML document. This works fine with:
foreach($html->find('b') as $q)
echo $q;
How can I show only the elements surrounded by <b> ... </b> tags that are followed by a <span class="marked">?
Update: I've used Firebug to get the CSS path for the elements. Now it looks like this:
foreach ($html->find('html body div#wrapper table.desc tbody tr td div span.marked') as $x)
foreach ($x->find('html body div#wrapper table.desc tbody tr td table.split tbody tr td b') as $d)
echo $d;
But it doesn't work. Any ideas?
Update: To clarify my question, here is a sample row of the document, including the opening and closing table tags.
<table width="100%" border="0" cellspacing="0" cellpadding="0" class="desc">
<tr>
<th width="25%" scope="col"><div align="center">1</div></th>
<th width="50" scope="col"><div align="center">2</div></th>
<th width="10%" scope="col"><div align="center">3</div></th>
<th width="15%" scope="col"><div align="center">4</div></th>
</tr>
<tr>
<td valign="top" bgcolor="#E9E9E9"><div style="text-align: center; font-weight: bold; margin-top: 2px"> 1 </div></td>
<td>
<table width="100%" border="0" cellspacing="0" cellpadding="0" class="split"> <tr>
<td>
<b> element to extract</b></td>
</tr>
<tr>
<td>
<table width="100%" border="0" cellspacing="0" cellpadding="0" class="split"> <tr>
<td width="15px" valign="top"> </td>
<td width="15px" valign="top">
<div style="background-color:green ;color:#FFFFFF; text-align:center;padding-bottom: 1px">
1
</div>
</td>
<td>
abed
</td>
</tr>
<tr>
<td width="15px" valign="top"> </td>
<td width="15px" valign="top">
<div style="background-color:green ;color:#FFFFFF; text-align:center;padding-bottom: 1px">
2
</div>
</td>
<td>
ddee
</td>
</tr>
<tr>
<td width="15px" valign="top"> </td>
<td width="15px" valign="top">
<div style="background-color:green ;color:#FFFFFF; text-align:center;padding-bottom: 1px">
3
</div>
</td>
<td>
xdef
</td>
</tr>
<tr>
<td width="15px" valign="top"> </td>
<td width="15px" valign="top">
<div style="background-color:green ;color:#FFFFFF; text-align:center;padding-bottom: 1px">
4
</div>
</td>
<td>
abbcc
</td>
</tr>
<tr>
<td width="15px" valign="top"> </td>
<td width="15px" valign="top">
<div style="background-color:green ;color:#FFFFFF; text-align:center;padding-bottom: 1px">
5
</div>
</td>
<td>
ab
</td>
</tr>
<tr>
<td width="15px" valign="top"> </td>
<td width="15px" valign="top">
<div style="background-color:green ;color:#FFFFFF; text-align:center;padding-bottom: 1px">
6
</div>
</td>
<td>
e1
</td>
</tr>
</table>
</td>
</tr>
</table>
</td>
<td valign="top"><div style="text-align: center"> <span class="marked">marked</span> </div></td>
<td valign="top"><div style="text-align: center"> </div></td>
</tr>
</table>
Recommended answer
Try the following CSS selector:
b > span.marked
That would return the span, though, so you will probably have to call $e->parent() to get to the b element.
Also see "Best Methods to parse HTML" for alternatives to SimpleHtmlDom.
Edit after update:
Your browser will modify the DOM. If you look at your markup, you will see that there are no tbody elements. Yet Firebug gives you
html body div#wrapper table.desc tbody tr td div span.marked
html body div#wrapper table.desc tbody tr td table.split tbody tr td b
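Simple HTML DOM parses the markup as it is written, not the DOM your browser builds, so a path that includes tbody will most likely match nothing here.
Also, your question does not match the queries. You asked how to find <b> ... </b> tags followed by a <span class="marked">, which can be read to mean either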
<b><span class="marked">foo</span></b>
or
<b><element>foo</element></b><span class="marked">foo</span>
For the first, use the child combinator shown earlier. For the second, use the adjacent sibling combinator
b + span.marked
to get the span, and then use $e->prev_sibling() to return the previous sibling of the element (or null if not found).
However, the markup you have shown contains neither. There is only a DIV with a SPAN child that has the marked class:
<div style="text-align: center"> <span class="marked">marked</span>
If that is what you want to match, it's the child combinator again. Of course, you then have to change the b to a div.
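Since the marked span and the b element sit in different cells of the same outer row, one way to connect them is to start from the span, walk up to its row, and then search that row for the b. The sketch below is only an illustration of that idea. It uses PHP's built-in DOMDocument and DOMXPath rather than Simple HTML DOM (one of the alternatives mentioned above) so that it runs without extra files. The trimmed markup, the second unmarked row, and the variable names ($markup, $found) are assumptions made for the example, not part of the original document.

```php
<?php
// Trimmed version of the sample markup (assumption: only the parts relevant
// to the query are kept, plus a hypothetical second row without a marker).
$markup = <<<HTML
<table class="desc">
  <tr>
    <td>
      <table class="split">
        <tr><td><b> element to extract</b></td></tr>
      </table>
    </td>
    <td><div style="text-align: center"> <span class="marked">marked</span> </div></td>
  </tr>
  <tr>
    <td>
      <table class="split">
        <tr><td><b> unmarked element</b></td></tr>
      </table>
    </td>
    <td><div style="text-align: center"> </div></td>
  </tr>
</table>
HTML;

// Parse the markup; parser warnings are collected internally instead of printed.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($markup);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

$found = [];
foreach ($xpath->query('//span[@class="marked"]') as $span) {
    // Nearest enclosing row of the marked span (the outer table row).
    $row = $xpath->query('ancestor::tr[1]', $span)->item(0);
    if ($row === null) {
        continue;
    }
    // Every b inside a nested table.split within that same row.
    foreach ($xpath->query('.//table[@class="split"]//b', $row) as $b) {
        $found[] = trim($b->textContent);
    }
}

print_r($found);
```

Because the XPath queries use the descendant axis (//), they do not depend on whether a tbody element is present. With Simple HTML DOM the same walk would presumably be find('span.marked'), repeated parent() calls until the tr is reached, and then find('b') on that row.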