XPath Table Within Table


Problem description

I am having trouble scraping a table-heavy page with DOMXPath.

The layout is really ugly: I am trying to get content out of a table within a table within a table. Firebug's FirePath gives me the following path for the table element:

 html/body/table/tbody/tr[3]/td/table[1]/tbody/tr[2]/td[1]/table[1]/tbody/tr[3]/td[4]

After endless experimenting I found that, for a standalone table, I need to remove the "tbody" steps from the path to make it work. But this doesn't seem to be enough for tables within tables. So my question is: how do I best get content out of tables within tables within tables?
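To illustrate the tbody step described above: browsers add tbody elements to the DOM when the source omits them, while PHP's DOMDocument does not, so a path copied from a browser tool may need those steps stripped. This is a minimal sketch, assuming the source HTML contains no tbody tags; the sample markup and variable names are made up for illustration and are not taken from the uploaded file.

```php
<?php
// Hypothetical sample markup: three nested tables with no <tbody> in the source.
$html = '<html><body><table><tr><td>
  <table><tr><td>
    <table><tr><td>a</td><td>b</td></tr></table>
  </td></tr></table>
</td></tr></table></body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

// Path as a browser tool might report it, including the <tbody> steps the browser added.
$browserPath = '/html/body/table/tbody/tr/td/table/tbody/tr/td/table/tbody/tr/td[2]';
// Drop every "/tbody" step so the path matches the tree DOMDocument actually built.
$domPath = str_replace('/tbody', '', $browserPath);

$withTbody = $xpath->query($browserPath)->length;
$nodes = $xpath->query($domPath);
$cell = $nodes->length ? trim($nodes->item(0)->textContent) : null;

echo "matches with tbody: $withTbody\n";
echo "cell without tbody: " . var_export($cell, true) . "\n";
```

If the real page does contain literal tbody tags in some of its tables, stripping every step would break the path at those points, so it is worth checking the raw source rather than the browser's view.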

I uploaded the file I am trying to scrape here: 1

Solution

I ran into the same problem as you while scraping complicated, poorly formatted HTML, where I wanted to get the values from a table inside other tables.

My approach was to target just the part I wanted with a series of functions like this:

class TableScraper { // class name is illustrative; the original shows only the methods
    private $xpath;  // a DOMXPath built from the loaded page

    // Gets the specific part of the table I chose to extract the contents from
    function parse_html() {
        $query = $this->xpath->query('//tr[@data-eventid]/@data-eventid'); // gets the table I want
        $this->parse_table();
    }

    // Extracts the content of the table
    function parse_table() {
        $query = $this->xpath->query('//tr[@data-eventid="405412"]/td[@class="impact"]/span[@title]/@title'); // ...etc
        $this->parseEvaluate();
    }

    function parseEvaluate() {
        // ...verifying values if correct
    }
}

This is just to give the idea.
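As a rough, runnable illustration of that idea: instead of spelling out the full path through every nested table, the sketch below finds the rows by a distinguishing attribute and then queries relative to each row. The sample markup, the title values, and the variable names are assumptions made up for the example; only the data-eventid attribute and the "impact" class come from the queries in the answer.

```php
<?php
// Hypothetical markup: the rows of interest sit in a nested table
// and carry a data-eventid attribute.
$html = '<table><tr><td><table>
  <tr data-eventid="405412"><td class="impact"><span title="High"></span></td></tr>
  <tr data-eventid="405413"><td class="impact"><span title="Low"></span></td></tr>
</table></td></tr></table>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

$impacts = [];
// Step 1: find the rows by attribute, however deeply the tables are nested.
foreach ($xpath->query('//tr[@data-eventid]') as $row) {
    $id = $row->getAttribute('data-eventid');
    // Step 2: query relative to the row (leading "." plus a context node).
    $title = $xpath->evaluate('string(./td[@class="impact"]/span/@title)', $row);
    // Step 3: keep only values that pass a simple check.
    if ($title !== '') {
        $impacts[$id] = $title;
    }
}
print_r($impacts);
```

Anchoring on an attribute with `//` keeps the query independent of how many tables wrap the row, which is the main reason to prefer it over an absolute path on a layout like this.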
