Need help with PHP DOM XPath parsing table
Problem description
I recently read about the DOM module in PHP and am now trying to use it to parse an HTML document. The page said this was a much better solution than using preg, but I'm having a hard time figuring out how to use it.
The page contains a table with dates and a variable number of events for each date.
First I need to get the text (a date) from a tr with valign="bottom". Then I need all the column values from every tr with valign="top" that follows that date tr, up until the next tr with valign="bottom" (the next date). The number of tr elements with column data is unknown; it can be zero or many.
This is what the HTML on the page looks like:
<table>
<tr valign="bottom">
<td colspan="4">2009-02-26</td>
</tr>
<tr valign="top">
<td>21:00</td>
<td>Column data</td>
<td>Column data</td>
<td>Column data</td>
</tr>
<tr valign="top">
<td>23:00</td>
<td>Column data</td>
<td>Column data</td>
<td>Column data</td>
</tr>
<tr valign="bottom">
<td colspan="4">2009-02-27</td>
</tr>
<tr valign="top">
<td>06:00</td>
<td>Column data</td>
<td>Column data</td>
<td>Column data</td>
</tr>
<tr valign="top">
<td>10:00</td>
<td>Column data</td>
<td>Column data</td>
<td>Column data</td>
</tr>
<tr valign="top">
<td>13:00</td>
<td>Column data</td>
<td>Column data</td>
<td>Column data</td>
</tr>
</table>
So far I've been able to get the first two dates (I'm only interested in the first two), but I don't know where to go from here.
The XPath query I use to get the date trs is
$result = $xpath->query('//tr[@valign="bottom"][position()<3]');
Now I need a way to connect all the events for a day to its date, i.e. select all the tds and all the column values up until the next date tr.
Recommended answer
// Collect libxml parse warnings internally; real-world HTML is rarely valid.
$oldSetting = libxml_use_internal_errors( true );
libxml_clear_errors();

$html = new DOMDocument();
$html->loadHTMLFile('http://url/table.html');

$xpath = new DOMXPath( $html );
$rows = $xpath->query( "//table/tr" );

foreach ( $rows as $row ) {
    // Date rows carry valign="bottom"; event rows carry valign="top".
    $isEventRow = ( $row->getAttribute('valign') === 'top' );

    foreach ( $row->childNodes as $node ) {
        // Event cells are printed inline; a date cell starts on a new line.
        print( $isEventRow ? $node->nodeValue : "<br>" . $node->nodeValue );
    }
    print("<br>");
}

// Restore the previous libxml error handling.
libxml_clear_errors();
libxml_use_internal_errors( $oldSetting );
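The code above prints the cells as it goes. If the goal is to associate each date with its event rows, as the question asks, a single pass over the rows that remembers the most recent date row may be enough. The following is only a sketch under some assumptions: the function name groupEventsByDate and the $maxDates parameter are hypothetical, the HTML is passed in as a string, and the table has the flat shape shown in the question (no nested tables).

```php
<?php
// Hypothetical helper, not part of the original answer: groups event rows
// (valign="top") under the preceding date row (valign="bottom").
function groupEventsByDate(string $htmlSource, int $maxDates = 2): array
{
    // Silence parser warnings for sloppy HTML, then restore the old setting.
    $oldSetting = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($htmlSource);
    libxml_clear_errors();
    libxml_use_internal_errors($oldSetting);

    $xpath = new DOMXPath($doc);
    $grouped = [];
    $currentDate = null;

    // Rows come back in document order, so each event row belongs to the
    // most recently seen date row.
    foreach ($xpath->query('//table//tr') as $row) {
        $valign = $row->getAttribute('valign');

        if ($valign === 'bottom') {
            // Stop once the requested number of dates has been collected.
            if (count($grouped) === $maxDates) {
                break;
            }
            $currentDate = trim($row->textContent);
            $grouped[$currentDate] = [];
        } elseif ($valign === 'top' && $currentDate !== null) {
            // Collect the text of every td in this event row.
            $cells = [];
            foreach ($xpath->query('td', $row) as $cell) {
                $cells[] = trim($cell->textContent);
            }
            $grouped[$currentDate][] = $cells;
        }
    }

    return $grouped;
}
```

Calling groupEventsByDate(file_get_contents('http://url/table.html')) would return an array keyed by date string, where each value is a list of rows and each row is a list of cell texts. A date with no events maps to an empty array, which covers the "zero or many" case from the question.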