Get HTML table into pandas DataFrame, not list of DataFrame objects

Question

I apologize if this question has been answered elsewhere, but I have not been able to find a satisfactory answer here or anywhere else.

I am somewhat new to Python and pandas and am having some difficulty getting HTML data into a pandas DataFrame. The pandas documentation says that .read_html() returns a list of DataFrame objects, so when I try to do some data manipulation to get rid of some of the samples, I get an error.

Here is my code to read the HTML:

import pandas as pd
df = pd.read_html('http://espn.go.com/nhl/statistics/player/_/stat/points/sort/points/year/2015/seasontype/2', header=1)

Then I try to clean it up:

df = df.dropna(axis=0, thresh=4)

I get the following error:

Traceback (most recent call last):
  File "module4.py", line 25, in <module>
    df = df.dropna(axis=0, thresh=4)
AttributeError: 'list' object has no attribute 'dropna'

How do I get this data into an actual DataFrame, similar to what .read_csv() does?

Recommended answer

From the pandas documentation at http://pandas.pydata.org/pandas-docs/version/0.17.1/io.html#io-read-html: "read_html returns a list of DataFrame objects, even if there is only a single table contained in the HTML content".

So df = df[0].dropna(axis=0, thresh=4) should do what you want: df[0] selects the first DataFrame in the returned list, and dropna is then called on that DataFrame instead of on the list.
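The same pattern can be sketched without depending on the live ESPN page. The inline HTML below is a made-up stand-in (the column names and rows are assumptions, not data from the original URL), and the example assumes an HTML parser that pandas can use, such as lxml, is installed. It shows that read_html returns a list even for a single table, and that indexing the list gives a DataFrame that dropna can work on.

```python
from io import StringIO

import pandas as pd

# Hypothetical HTML standing in for the real page: a single table with
# one mostly-empty divider row between two data rows.
html = """
<table>
  <tr><th>RK</th><th>PLAYER</th><th>TEAM</th><th>GP</th><th>PTS</th></tr>
  <tr><td>1</td><td>Player A</td><td>AAA</td><td>82</td><td>87</td></tr>
  <tr><td></td><td>(divider row)</td><td></td><td></td><td></td></tr>
  <tr><td>2</td><td>Player B</td><td>BBB</td><td>77</td><td>86</td></tr>
</table>
"""

# read_html always returns a list of DataFrames, one per <table> found,
# even when the document contains only one table.
tables = pd.read_html(StringIO(html))

# Pick the table you want by position; here there is only one.
df = tables[0]

# Keep only rows with at least 4 non-missing values, which drops the
# divider row (it has a single non-empty cell).
df = df.dropna(axis=0, thresh=4)

print(type(tables).__name__, len(tables))
print(df)
```

If a page contains several tables, tables[0] may not be the one you want; inspecting len(tables) and the first few rows of each entry is a simple way to find the right index.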
