Python Beautiful Soup解析具有特定ID的表 [英] Python Beautiful Soup parse a table with a specific ID

查看:79
本文介绍了Python Beautiful Soup解析具有特定ID的表的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试从具有我所知道的特定ID的表中获取数据. 由于某种原因,代码不断给我无"结果.

I'm trying to get the data from a table with a specific ID which I know. For some reason, the code keeps giving me a None result.

根据我要解析的HTML代码:

From the HTML code I'm trying to parse:

<table cellspacing="0" cellpadding="3" border="0" id="ctl00_SPWebPartManager1_g_c001c0d9_0cb8_4b0f_b75a_7cc3b6f7d790_ctl00_HistoryData1_gridHistoryData_DataGrid1" style="width:100%;border-collapse:collapse;">
    <tr class="gridHeader" valign="top">
        <td class="titleGridRegNoB" align="center" valign="top"><span dir=RTL>שווי שוק (אלפי ש"ח)</span></td>
        <td class="titleGridReg" align="center" valign="top">הון רשום למסחר</td>
        <td class="titleGridReg" align="center" valign="top">שער נמוך</td><td class="titleGridReg" align="center" valign="top">שער גבוה</td>
        <td class="titleGridReg" align="center" valign="top">שער בסיס</td>
        <td class="titleGridReg" align="center" valign="top">שער פתיחה</td><td class="titleGridReg" align="center" valign="top"><span dir="rtl">שער נעילה (באגורות)</span></td>
        <td class="titleGridReg" align="center" valign="top">שער נעילה מתואם</td><td class="titleGridReg" align="center" valign="top">תאריך</td>
    </tr>
    <tr onmouseover="this.style.backgroundColor='#FDF1D7'" onmouseout="this.style.backgroundColor='#ffffff'">

...等等

我的代码:

html = br.response().read()
soup = BeautifulSoup(html)

table = soup.find(lambda tag: tag.name=='table' and tag.has_key('id') and tag['id']=="ctl00_SPWebPartManager1_g_c001c0d9_0cb8_4b0f_b75a_7cc3b6f7d790_ctl00_HistoryData1_gridHistoryData_DataGrid1")
rows = table.findAll(lambda tag: tag.name=='tr')

In [100]: print table
None

推荐答案

来自文档:

table = soup.find('table', id="ctl00_SPWebPartManager1_g_c001c0d9_0cb8_4b0f_b75a_7cc3b6f7d790_ctl00_HistoryData1_gridHistoryData_DataGrid1")

行行:

rows = table.findAll('tr')

对于编码问题,请尝试从utf-8解码,然后重新编码.

For the encoding problem, try decoding it from utf-8, and re-encode it.

html = br.response().read().decode('utf-8')
soup = BeautifulSoup(html.encode('utf-8'))

这篇关于Python Beautiful Soup解析具有特定ID的表的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆