Beautiful Soup and Tables

Views: 153 Published: 2016/8/5 19:07:16 Tags: python beautifulsoup html-table


Question

Hi, I'm trying to parse an HTML table using Beautiful Soup. The table looks something like this:

<table width=100% border=1 cellpadding=0 cellspacing=0 bgcolor=#e0e0cc>
 <tr>
  <td width=12% height=1 align=center valign=middle  bgcolor=#e0e0cc bordercolorlight=#000000 bordercolordark=white> <b><font face="Verdana" size=1><a href="http://www.dailystocks.com/" alt="DailyStocks.com" title="Home">Home</a></font></b></td>
 </tr>
</table>
<table width="100%" border="0" cellpadding="1" cellspacing="1">
  <tr class="odd"><td class="left"><a href="whatever">ABX</a></td><td class="left">Barrick Gold Corp.</td><td>55.95</td><td>55.18</td><td class="up">+0.70</td><td>11040601</td><td>70.28%</td><td><center>&nbsp;<a href="whatever" class="bcQLink">&nbsp;Q&nbsp;</a>&nbsp;<a href="chart.asp?sym=ABX&code=XDAILY" class="bcQLink">&nbsp;C&nbsp;</a>&nbsp;<a href="texpert.asp?sym=ABX&code=XDAILY" class="bcQLink">&nbsp;O&nbsp;</a>&nbsp;</center></td></tr>
 </table>

I would like to get the information from the second table. So far I have tried this code:

html = file("whatever.html")
soup = BeautifulSoup(html)
t = soup.find(id='table')
dat = [ map(str, row.findAll("td")) for row in t.findAll("tr") ]

That doesn't seem to work. Any help would be much appreciated. Thanks.

Recommended answer

The first problem is with this statement: "t = soup.find(id='table')". Nothing in the document has an id of "table". I think what you meant is "t = soup.find('table')", which finds a table. Unfortunately, it only finds the first table.

You could do "t = soup.findAll('table')[1]", but this would be quite brittle, because it depends on the position of the table within the page.
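To make the difference between these lookups concrete, here is a minimal sketch. It assumes Python 3 with the bs4 package (Beautiful Soup 4) and its find_all spelling, neither of which appears in the original post, and it uses a trimmed-down stand-in for the page in the question rather than the real file.

```python
from bs4 import BeautifulSoup

# Assumption: a trimmed-down stand-in for the page in the question.
# It has two tables, and neither of them has an id attribute.
HTML = """
<table width="100%" border="1"><tr><td><a href="http://www.dailystocks.com/">Home</a></td></tr></table>
<table width="100%" border="0">
  <tr class="odd"><td class="left">ABX</td><td>55.95</td></tr>
</table>
"""

soup = BeautifulSoup(HTML, "html.parser")

# No element has id="table", so this lookup finds nothing.
by_id = soup.find(id="table")

# find("table") returns only the first match: the navigation table.
first = soup.find("table")

# Indexing the full list works, but breaks if the page gains or loses a table.
second = soup.find_all("table")[1]

print(by_id)                 # None
print(first.td.get_text())   # Home
print(second.tr["class"])    # ['odd']
```

Because find returns None when nothing matches, the original code would fail on the next line with an AttributeError when calling findAll on None.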

I suggest something like the following:

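# Assumes BeautifulSoup 3 on Python 2, which the file() builtin and findAll suggest.
from BeautifulSoup import BeautifulSoup

html = file("whatever.html")
soup = BeautifulSoup(html)
# Keep only the data rows, which carry class "odd" or "even".
rows = soup.findAll("tr", {'class': ['odd', 'even']})
dat = []
for row in rows:
  # Convert each <td> cell of the row to its HTML string.
  dat.append(map(str, row.findAll('td')))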

The resulting dat variable is:

[['<td class="left"><a href="whatever">ABX</a></td>', '<td class="left">Barrick Gold Corp.</td>', '<td>55.95</td>', '<td>55.18</td>', '<td class="up">+0.70</td>', '<td>11040601</td>', '<td>70.28%</td>', '<td><center>&nbsp;<a href="whatever" class="bcQLink">&nbsp;Q&nbsp;</a>&nbsp;<a href="chart.asp?sym=ABX&amp;code=XDAILY" class="bcQLink">&nbsp;C&nbsp;</a>&nbsp;<a href="texpert.asp?sym=ABX&amp;code=XDAILY" class="bcQLink">&nbsp;O&nbsp;</a>&nbsp;</center></td>']]
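If the goal is the cell values rather than the raw markup, a variant along the following lines may be more convenient. This is only a sketch under assumptions not found in the original answer: Python 3, the bs4 package (Beautiful Soup 4), and a shortened copy of the second table with the last link column left out.

```python
from bs4 import BeautifulSoup

# Assumption: a shortened copy of the second table from the question,
# with the final column of Q/C/O links omitted.
HTML = """
<table width="100%" border="0" cellpadding="1" cellspacing="1">
  <tr class="odd"><td class="left"><a href="whatever">ABX</a></td><td class="left">Barrick Gold Corp.</td><td>55.95</td><td>55.18</td><td class="up">+0.70</td><td>11040601</td><td>70.28%</td></tr>
</table>
"""

soup = BeautifulSoup(HTML, "html.parser")

# class_ accepts a list and matches rows whose class is any of the listed values.
rows = soup.find_all("tr", class_=["odd", "even"])

# get_text(strip=True) strips whitespace from each text fragment before joining them.
dat = [[td.get_text(strip=True) for td in row.find_all("td")] for row in rows]

print(dat)
# [['ABX', 'Barrick Gold Corp.', '55.95', '55.18', '+0.70', '11040601', '70.28%']]
```

Selecting rows by class keeps the lookup independent of how many tables precede the data on the page, which is the same design choice the answer makes.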

Edit: fixed a wrong array index.
