Use first row as column names? Pandas read_html
Problem description
I have this simple one-line script:
from pandas import read_html
print(read_html('http://money.cnn.com/data/hotstocks/', flavor='bs4'))
This works fine, but the column names are missing; the columns are being identified as 1, 2, 3. Is there an easy way to tell pandas to use the first row as the column names? I know I could store the names in a list, set them, and then skip the first row, but I am wondering if there is an easier or better way.
It currently prints:
                           0       1       2         3
 0                   Company   Price  Change  % Change
 1            AAPL Apple Inc  115.31   +6.17    +5.65%
 2  BAC Bank of America Corp   15.20   -0.43    -2.75%
 3           YHOO Yahoo! Inc   46.46   -1.53    -3.19%
 4       MSFT Microsoft Corp   41.19   -1.47    -3.45%
 5           FB Facebook Inc   76.24   +0.46    +0.61%
 6    GE General Electric Co   23.84   -0.54    -2.21%
 7                T AT&T Inc   32.68   -0.13    -0.40%
 8           F Ford Motor Co   14.46   -0.24    -1.63%
 9           INTC Intel Corp   33.78   -0.41    -1.20%
10    CSCO Cisco Systems Inc   26.80   -0.09    -0.35%
Recommended answer
`read_html` takes a `header` parameter. You can pass it the index of the row to use as the column names:
read_html('http://money.cnn.com/data/hotstocks/', header=0, flavor='bs4')
It is worth noting this caveat in the docs:
For example, you might need to manually assign column names if the column names are converted to NaN when you pass the header=0 argument.
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.io.html.read_html.html
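If `header=0` does leave you with NaN column names, the manual fallback mentioned in the question could look roughly like the sketch below. The `raw` frame and the `promote_first_row` helper are made up for illustration (neither is part of pandas), and the frame is built by hand from a couple of the rows above so the snippet runs without fetching the page.

```python
import pandas as pd

# Hypothetical stand-in for a table parsed without a header:
# the real column names sit in row 0 and the columns are labelled 0..3.
raw = pd.DataFrame([
    ["Company", "Price", "Change", "% Change"],
    ["AAPL Apple Inc", "115.31", "+6.17", "+5.65%"],
    ["BAC Bank of America Corp", "15.20", "-0.43", "-2.75%"],
])


def promote_first_row(df):
    """Return a new frame whose column names are taken from df's first row."""
    # Drop the first row and renumber the remaining rows from 0.
    out = df.iloc[1:].reset_index(drop=True)
    # Use the values of the original first row as the column names.
    out.columns = df.iloc[0].tolist()
    return out


table = promote_first_row(raw)
print(table)
```

With a real page you would apply the same helper to one of the frames in the list that `read_html` returns, for example `promote_first_row(read_html(url)[0])`. The values stay as strings here, so numeric columns would still need converting afterwards.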