BeautifulSoup results to pandas DataFrame


Question

The code below returns a table with the following results:

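import bs4
import requests

# url is the address of the page that contains the table (not shown in the question)
r = requests.get(url)
soup = bs4.BeautifulSoup(r.text, 'lxml')

# find() returns the first element whose class is table_grey_border
mylist = soup.find(attrs={'class': 'table_grey_border'})
print(mylist)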

Result (it goes on for about 1700 rows):

<table cellpadding="0" cellspacing="2" class="table_grey_border" width="100%">
<tr valign="top">
<td class="verd_black12" width="18%"><b>STOCK CODE</b></td>
<td class="verd_black12" width="42%"><b>NAME OF LISTED SECURITIES</b></td>
<td class="verd_black12" width="19%"><b>BOARD LOT</b></td>
<td class="verd_black12" colspan="4" width="12%"><b>REMARK</b></td>
</tr>
<tr class="tr_normal">
<td class="verd_black12" width="18%">00001</td>
<td class="verd_black12" width="42%"><a href="../../../invest/company/profile_page_e.asp?WidCoID=00001&amp;WidCoAbbName=&amp;Month=&amp;langcode=e" target="_parent">CKH HOLDINGS</a></td>
<td class="verd_black12" width="19%">500</td>
<td align="center" class="verd_black12" width="3%">#</td>
<td align="center" class="verd_black12" width="3%">H</td>
<td align="center" class="verd_black12" width="3%">O</td>
<td align="center" class="verd_black12" width="3%">F</td>
</tr>
<tr class="tr_normal">
<td class="verd_black12" width="18%">00002</td>
<td class="verd_black12" width="42%"><a href="../../../invest/company/profile_page_e.asp?WidCoID=00002&amp;WidCoAbbName=&amp;Month=&amp;langcode=e" target="_parent">CLP HOLDINGS</a></td>
<td class="verd_black12" width="19%">500</td>
<td align="center" class="verd_black12" width="3%">#</td>
<td align="center" class="verd_black12" width="3%">H</td>
<td align="center" class="verd_black12" width="3%">O</td>
<td align="center" class="verd_black12" width="3%">F</td>
</tr>
...

My question is: how do I put each of these rows into a pandas DataFrame? I tried the code below, but it returns an error:

a = pandas.read_html(mylist)
print(a)

Error:

TypeError: 'NoneType' object is not callable

Recommended answer

From the docs: pandas.read_html can take the URL directly, and attrs selects the table to parse:

pandas.read_html(url, attrs={'class': 'table_grey_border'})
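Note that read_html returns a list of DataFrames, one per matching table, and that it expects a URL, a file-like object, or HTML text rather than a bs4 Tag. If you already have the Tag from soup.find, converting it back to markup with str() should work as well. Here is a minimal sketch of both routes. It assumes lxml is installed, uses a shortened stand-in for the page (two rows, REMARK columns left out), and the names html, tables and df are only for illustration. The converters argument keeps the leading zeros in STOCK CODE.

```python
from io import StringIO

import bs4
import pandas as pd

# Shortened stand-in for the real page: two data rows, REMARK columns omitted.
html = """
<table class="table_grey_border">
<tr><td><b>STOCK CODE</b></td><td><b>NAME OF LISTED SECURITIES</b></td><td><b>BOARD LOT</b></td></tr>
<tr><td>00001</td><td><a href="#">CKH HOLDINGS</a></td><td>500</td></tr>
<tr><td>00002</td><td><a href="#">CLP HOLDINGS</a></td><td>500</td></tr>
</table>
"""

# Route 1: start from the Tag found with BeautifulSoup, as in the question.
soup = bs4.BeautifulSoup(html, "lxml")
mylist = soup.find(attrs={"class": "table_grey_border"})

# str(mylist) turns the Tag back into HTML text; StringIO makes it file-like.
# header=0 uses the first row as column names; the converter keeps "00001" a string.
tables = pd.read_html(
    StringIO(str(mylist)),
    header=0,
    converters={"STOCK CODE": str},
)
df = tables[0]  # read_html always returns a list of DataFrames
print(df)

# Route 2: skip BeautifulSoup and let read_html pick the table via attrs.
# With the live page you would pass the URL instead of StringIO(html).
df_attrs = pd.read_html(
    StringIO(html),
    attrs={"class": "table_grey_border"},
    header=0,
    converters={"STOCK CODE": str},
)[0]
```

On the real page the header row has REMARK spanning four cells, so expect extra REMARK columns in the result.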
