HTML Parsing - Get data from a table inside a div?

Views: 116 | Published: 2018/6/19 21:08:55 | Tags: jquery html json parsing screen-scraping

Question

I am relatively new to the whole idea for HTML parsing/scraping. I was hoping that I could come here to get the help that I need!

Basically, what I am looking to do (I think) is specify the URL of the page I wish to grab the data from. In this case: http://www.epgpweb.com/guild/us/Caelestrasz/Crimson/

From there, I want to grab the table class=listing in the div id=snapshot_table.

I then wish to embed that table onto my own page and have it update when the original content is updated.

I have read a few of the other posts on Google and Stack Overflow, and I also had a look at a tutorial on Nettuts+, but it just seemed to be a bit too much to take in at once.

Hopefully someone here can help me out and make this as simple as possible :)

Cheers,

Mat

--Edit--

Current code as of 11:22am (GMT+10)
<?php
# don't forget the library
include('simple_html_dom.php');
?>
<html>
<head>
</head>
<body>
<?php
$html = file_get_html('http://www.epgpweb.com/guild/us/Caelestrasz/Crimson/');
$table = $html->find('#snapshot_table table.listing');
print_r($table);
?>
</body>
</html>
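For reference, the selection described in the question (the table with class listing inside the div with id snapshot_table) can also be sketched with PHP's built-in DOM extension instead of simple_html_dom. This is only a minimal sketch, not the asker's code: the sample markup, the variable names, and the helper extract_listing_table are made up for illustration, and it assumes the table is actually present in the HTML the server returns rather than being filled in later by JavaScript.

```php
<?php
// Sample markup standing in for the fetched page (hypothetical content).
$page = <<<PAGE
<html><body>
  <div id="snapshot_table">
    <table class="listing">
      <tr><th>Member</th><th>EP</th></tr>
      <tr><td>Alice</td><td>1200</td></tr>
    </table>
  </div>
</body></html>
PAGE;

// Return the outer HTML of the first table.listing inside #snapshot_table,
// or null when no such table exists. (Hypothetical helper.)
function extract_listing_table(string $html): ?string
{
    $doc = new DOMDocument();
    // Real pages are rarely valid HTML; collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match "listing" as one of possibly several space-separated classes.
    $query = '//div[@id="snapshot_table"]//table'
           . '[contains(concat(" ", normalize-space(@class), " "), " listing ")]';
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    $out = $doc->saveHTML($nodes->item(0));
    return $out === false ? null : $out;
}

$table = extract_listing_table($page);
echo $table ?? 'table not found';
```

With a live page, the string passed in would come from something like file_get_contents($url); the extracted markup can then be echoed into your own page on each request.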

Solution
I think I got it to work, and I learned a lot! :)
<?php
//Get the current timestamp
$url = 'http://www.epgpweb.com/api/snapshot/us/Caelestrasz/Crimson';
$url = file_get_contents($url);
$url = substr($url, -12, 10);

//Get the member data based on the timestamp
$url = 'http://www.epgpweb.com/api/snapshot/us/Caelestrasz/Crimson/'.$url;
$url = file_get_contents($url);

//Convert the unicode to html entities, as I found here:
//http://stackoverflow.com/questions/2934563/how-to-decode-unicode-escape-sequences-like-u00ed-to-proper-utf-8-encoded-char
function replace_unicode_escape_sequence($match) {
    return mb_convert_encoding(pack('H*', $match[1]), 'UTF-8', 'UCS-2BE');
}
$url = preg_replace_callback('/\\\\u([0-9a-f]{4})/i', 'replace_unicode_escape_sequence', $url);

//erase/replace the insignificant parts, to put the data into an array
function erase($a){
    global $url;
    $url = explode($a, $url);
    $url = implode("", $url);
}
function replace($a, $b){
    global $url;
    $url = explode($a, $url);
    $url = implode($b, $url);
}
replace("[[", ";");
replace("]]", ";");
replace("],", ";");
erase('[');
erase('"');
replace(":", ",");
$url = explode(";", $url);

//lose the front and end bits, and maintain the member data
array_shift($url);
array_pop($url);

//put the data into an array
foreach($url as $k=>$v){
    $v = explode(",", $v);
    foreach($v as $k2=>$v2){
        $data[$k][$k2] = $v2;
    }
    $pr = round(intval($data[$k][1]) / intval($data[$k][2]), 3);
    $pr = str_pad($pr, 5, "0", STR_PAD_RIGHT);
    $pr = substr($pr, 0, 5);
    $data[$k][3] = $pr;
}

//sort the array by PR number
function compare($x, $y)
{
    if ( $x[3] == $y[3] )
        return 0;
    else if ( $x[3] > $y[3] )
        return -1;
    else
        return 1;
}
usort($data, 'compare');

//output the data into a table
echo "<table><tbody><tr><th>Member</th><th>EP</th><th>GP</th><th>PR</th></tr>";
foreach($data as $k=>$v){
    echo "<tr>";
    foreach($v as $v2){
        echo "<td>".$v2."</td>";
    }
    echo "</tr>";
}
echo "</tbody></table>";
?>
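The code above splits the raw response text by hand. If the endpoint returns valid JSON (the post is tagged json, but the exact response shape is not shown here, so this is an assumption), the same steps could probably be sketched with json_decode, which also decodes the \uXXXX escapes. The sample payload, the roster key, and the helpers build_rows and render_table are hypothetical and only illustrate the idea. The sketch needs PHP 7.4 or later.

```php
<?php
// Hypothetical sample shaped like the rows the answer parses: [name, EP, GP].
// The real API response may be structured differently.
$json = '{"roster": [["Al\u00ed", 1200, 400], ["Bo", 900, 450], ["Cy", 300, 0]]}';

// Turn the decoded roster into rows of [name, EP, GP, PR], highest PR first.
function build_rows(string $json): array
{
    // json_decode() converts \uXXXX escapes to UTF-8 on its own.
    $decoded = json_decode($json, true);
    if (!is_array($decoded) || !isset($decoded['roster'])) {
        return [];
    }
    $rows = [];
    foreach ($decoded['roster'] as [$name, $ep, $gp]) {
        // Guard against division by zero when GP is 0.
        $pr = $gp > 0 ? round($ep / $gp, 3) : 0.0;
        $rows[] = [$name, (int) $ep, (int) $gp, $pr];
    }
    // Sort descending by PR (index 3).
    usort($rows, fn(array $a, array $b): int => $b[3] <=> $a[3]);
    return $rows;
}

// Render the rows as an HTML table, escaping every cell.
function render_table(array $rows): string
{
    $out = '<table><tbody><tr><th>Member</th><th>EP</th><th>GP</th><th>PR</th></tr>';
    foreach ($rows as [$name, $ep, $gp, $pr]) {
        // Show PR with three decimals and no thousands separator.
        $cells = [$name, $ep, $gp, number_format($pr, 3, '.', '')];
        $out .= '<tr>';
        foreach ($cells as $cell) {
            $out .= '<td>' . htmlspecialchars((string) $cell) . '</td>';
        }
        $out .= '</tr>';
    }
    return $out . '</tbody></table>';
}

echo render_table(build_rows($json));
```

Keeping PR as a float until output means the sort compares numbers rather than padded strings, and escaping each cell avoids injecting markup from member names into the page.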
