Reading the sheets of an Excel workbook from a URL into a `pandas.DataFrame`


Question

After looking at different ways to read a URL pointing to a .xls file, I decided to go with xlrd.

I am having a difficult time converting an 'xlrd.book.Book' object to a 'pandas.DataFrame'.

I have the following:

import pandas
import xlrd
import urllib2  # Python 2 only

link = 'http://www.econ.yale.edu/~shiller/data/chapt26.xls'
socket = urllib2.urlopen(link)

# read the downloaded bytes into an xlrd workbook
xlfile = xlrd.open_workbook(file_contents=socket.read())

# store the list of sheets
sheets = xlfile.sheets()

I want to take the last sheet in sheets and import it as a pandas.DataFrame. Any ideas as to how I can accomplish this? I've tried pandas.ExcelFile.parse(), but it wants a path to an Excel file. I could certainly save the file first and then parse it (using tempfile or something similar), but I'm trying to follow Pythonic guidelines and use functionality that is likely already built into pandas.

Any guidance is greatly appreciated, as always.

Recommended answer

You can pass the socket directly to ExcelFile:

>>> import pandas as pd
>>> import urllib2
>>> link = 'http://www.econ.yale.edu/~shiller/data/chapt26.xls'
>>> socket = urllib2.urlopen(link)
>>> xd = pd.ExcelFile(socket)
NOTE *** Ignoring non-worksheet data named u'PDVPlot' (type 0x02 = Chart)
NOTE *** Ignoring non-worksheet data named u'ConsumptionPlot' (type 0x02 = Chart)
>>> xd.sheet_names
[u'Data', u'Consumption', u'Calculations']
>>> df = xd.parse(xd.sheet_names[-1], header=None)
>>> df
                                   0   1   2   3         4
0        Average Real Interest Rate: NaN NaN NaN  1.028826
1    Geometric Average Stock Return: NaN NaN NaN  0.065533
2              exp(geo. Avg. return) NaN NaN NaN  0.067728
3  Geometric Average Dividend Growth NaN NaN NaN  0.012025
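
The session above uses Python 2 (urllib2). On Python 3 the same idea could be sketched roughly as follows. This is a minimal sketch, not the answer's original code: the helper names fetch_workbook_bytes and last_sheet_as_dataframe are hypothetical, the timeout value is an arbitrary assumption, and reading a .xls file is assumed to require xlrd to be installed alongside pandas.

```python
import io
import urllib.request


def fetch_workbook_bytes(url, timeout=30):
    """Download `url` and return its contents as a seekable in-memory buffer.

    Hypothetical helper; the 30 second timeout is an arbitrary choice.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return io.BytesIO(response.read())


def last_sheet_as_dataframe(url):
    """Return the last sheet of the workbook at `url` as a DataFrame.

    Hypothetical helper mirroring the ExcelFile/parse steps of the answer.
    """
    # Imported here so the download helper stays usable without pandas.
    import pandas as pd

    # ExcelFile accepts a file-like object, so no temporary file is needed.
    with pd.ExcelFile(fetch_workbook_bytes(url)) as workbook:
        # sheet_names keeps workbook order, so [-1] selects the last sheet.
        return workbook.parse(workbook.sheet_names[-1], header=None)
```

Calling last_sheet_as_dataframe(link) with the link from the question should follow the same steps as the session above. Buffering the download in io.BytesIO gives pandas a seekable object, which a raw HTTP response is not.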
