Need help with PHP DOM XPath parsing table


Question

I recently read about the DOM module in PHP, and now I'm trying to use it to parse an HTML document. The page said this was a much better solution than using preg, but I'm having a hard time figuring out how to use it.

The page contains a table with dates and a variable number of events for each date.

First I need to get the text (a date) from a tr with valign="bottom". Then I need all the column values from every tr with valign="top" that follows that date row, up until the next tr with valign="bottom" (the next date). The number of tr rows with column data is unknown; it can be zero or many.

This is what the HTML on the page looks like:

<table>
    <tr valign="bottom">
        <td colspan="4">2009-02-26</td>
    </tr>
    <tr valign="top">
        <td>21:00</td>
        <td>Column data</td>
        <td>Column data</td>
        <td>Column data</td>
    </tr>
    <tr valign="top">
        <td>23:00</td>
        <td>Column data</td>
        <td>Column data</td>
        <td>Column data</td>
    </tr>
    <tr valign="bottom">
        <td colspan="4">2009-02-27</td>
    </tr>
    <tr valign="top">
        <td>06:00</td>
        <td>Column data</td>
        <td>Column data</td>
        <td>Column data</td>
    </tr>
    <tr valign="top">
        <td>10:00</td>
        <td>Column data</td>
        <td>Column data</td>
        <td>Column data</td>
    </tr>
    <tr valign="top">
        <td>13:00</td>
        <td>Column data</td>
        <td>Column data</td>
        <td>Column data</td>
    </tr>
</table>

So far I've been able to get the first two dates (I'm only interested in the first two), but I don't know how to proceed from here.

The XPath query I use to get the date trs is

$result = $xpath->query('//tr[@valign="bottom"][position()<3]');

Now I need a way to connect all the events for a day to its date, i.e. select all the tds and their column values up until the next date tr.

Recommended answer

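// Collect libxml parse warnings internally instead of emitting them for malformed HTML.
$oldSetting = libxml_use_internal_errors( true );
libxml_clear_errors();

$html = new DOMDocument();
$html->loadHTMLFile('http://url/table.html');

$xpath = new DOMXPath( $html );
// Select every row that is a direct child of a table.
$elements = $xpath->query( "//table/tr" );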

foreach ( $elements as $item ) {
  // Date rows have valign="bottom"; event rows have valign="top".
  $isEventRow = ( $item->getAttribute('valign') === 'top' );

  // Walk the row's child nodes (the td cells and the whitespace between them).
  for ( $node = $item->firstChild; $node !== null;
        $node = $node->nextSibling ) {
    if ( $isEventRow )
    {
      print( $node->nodeValue );
    }
    else
    {
      // Put each node of a date row on its own line.
      print( "<br>" . $node->nodeValue );
    }
  }
  print( "<br>" );
}

libxml_clear_errors(); 
libxml_use_internal_errors( $oldSetting ); 
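
The code above prints each row's cells but does not tie the event rows to their date, which is what the question asks for. The following is a minimal sketch of one way to do that grouping, not a definitive implementation. It assumes the markup matches the sample table in the question. The function name groupEventsByDate, its $maxDates parameter, and the inline $markup sample are hypothetical and introduced only for illustration; for the real page you would load it with loadHTMLFile instead.

```php
<?php
// Sample markup mirroring the table in the question (an assumption for this sketch).
$markup = <<<HTML
<table>
    <tr valign="bottom"><td colspan="4">2009-02-26</td></tr>
    <tr valign="top"><td>21:00</td><td>Column data</td><td>Column data</td><td>Column data</td></tr>
    <tr valign="top"><td>23:00</td><td>Column data</td><td>Column data</td><td>Column data</td></tr>
    <tr valign="bottom"><td colspan="4">2009-02-27</td></tr>
    <tr valign="top"><td>06:00</td><td>Column data</td><td>Column data</td><td>Column data</td></tr>
    <tr valign="top"><td>10:00</td><td>Column data</td><td>Column data</td><td>Column data</td></tr>
    <tr valign="top"><td>13:00</td><td>Column data</td><td>Column data</td><td>Column data</td></tr>
    <tr valign="bottom"><td colspan="4">2009-02-28</td></tr>
</table>
HTML;

/**
 * Hypothetical helper: returns [date => [[cell, cell, ...], ...]] for the
 * first $maxDates date rows, attaching each valign="top" row to the most
 * recent valign="bottom" row that precedes it.
 */
function groupEventsByDate(string $markup, int $maxDates = 2): array
{
    // Keep libxml quiet about imperfect HTML, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($markup);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $events = [];
    $currentDate = null;

    // Rows come back in document order, so one pass is enough.
    foreach ($xpath->query('//table//tr') as $row) {
        $valign = $row->getAttribute('valign');

        if ($valign === 'bottom') {
            // Stop once the requested number of dates has been collected.
            if (count($events) >= $maxDates) {
                break;
            }
            $currentDate = trim($row->textContent);
            $events[$currentDate] = [];
        } elseif ($valign === 'top' && $currentDate !== null) {
            // Collect the text of each td in this event row.
            $cells = [];
            foreach ($xpath->query('td', $row) as $cell) {
                $cells[] = trim($cell->textContent);
            }
            $events[$currentDate][] = $cells;
        }
    }

    return $events;
}

$events = groupEventsByDate($markup);
print_r($events);
```

Walking the rows once and remembering the current date avoids a more intricate XPath expression over following-sibling rows, and a date with no event rows simply maps to an empty array.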
