Use first row as column names? Pandas read_html

Problem description

I have this simple one-line script:

from pandas import read_html

print(read_html('http://money.cnn.com/data/hotstocks/', flavor='bs4'))

This works fine, but the column names are missing: they are identified as 0, 1, 2, 3. Is there an easy way to tell pandas to use the first row as the column names? I know I could store the names as a list, set them, and then skip the first row, but I am wondering if there is an easier/better way.

Currently it prints:

                           0       1       2         3
0                    Company   Price  Change  % Change
1             AAPL Apple Inc  115.31   +6.17    +5.65%
2   BAC Bank of America Corp   15.20   -0.43    -2.75%
3            YHOO Yahoo! Inc   46.46   -1.53    -3.19%
4        MSFT Microsoft Corp   41.19   -1.47    -3.45%
5            FB Facebook Inc   76.24   +0.46    +0.61%
6     GE General Electric Co   23.84   -0.54    -2.21%
7                 T AT&T Inc   32.68   -0.13    -0.40%
8            F Ford Motor Co   14.46   -0.24    -1.63%
9            INTC Intel Corp   33.78   -0.41    -1.20%
10    CSCO Cisco Systems Inc   26.80   -0.09    -0.35%

Recommended answer

`read_html` takes a header parameter. You can pass a row index:

read_html('http://money.cnn.com/data/hotstocks/', header=0, flavor='bs4')
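To see the effect of header=0 without depending on the live page, here is a minimal sketch using an inline table. SAMPLE_HTML is a made-up stand-in for the CNN page (an assumption, not the real markup); its header cells are plain `<td>` elements, which is the kind of table where pandas would otherwise fall back to integer column labels. It relies on the default parser, so lxml (or bs4 with html5lib) needs to be installed.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the page: the header row uses <td>, not <th>.
SAMPLE_HTML = """
<table>
  <tr><td>Company</td><td>Price</td><td>Change</td></tr>
  <tr><td>AAPL Apple Inc</td><td>115.31</td><td>+6.17</td></tr>
  <tr><td>BAC Bank of America Corp</td><td>15.20</td><td>-0.43</td></tr>
</table>
"""

# read_html returns a list of DataFrames, one per <table> it finds.
tables = pd.read_html(StringIO(SAMPLE_HTML), header=0)
df = tables[0]

print(list(df.columns))
print(df)
```

Note that read_html always returns a list, so the frame itself is tables[0] even when the page contains a single table.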

Worth noting this caveat in the docs:

For example, you might need to manually assign column names if the column names are converted to NaN when you pass the header=0 argument

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.io.html.read_html.html
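If header=0 does give NaN column names, the manual route mentioned in the question is a reasonable fallback. A rough sketch, assuming the table was read without a header so the names sit in row 0; the raw frame below is invented data shaped like that output, not the actual page contents:

```python
import pandas as pd

# Hypothetical frame shaped like the unlabelled read_html output.
raw = pd.DataFrame([
    ["Company", "Price", "Change"],
    ["AAPL Apple Inc", "115.31", "+6.17"],
    ["BAC Bank of America Corp", "15.20", "-0.43"],
])

# Drop the first row from the data, then use its values as column names.
df = raw.iloc[1:].reset_index(drop=True)
df.columns = raw.iloc[0].tolist()

# The values are still strings at this point; convert the numeric column.
df["Price"] = pd.to_numeric(df["Price"])

print(df)
```

Because the header row was parsed as data, every column comes back as strings, so any numeric columns need an explicit conversion afterwards.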
