Read ZipFile from URL into StringIO and parse with pandas.read_csv


Question

I'm trying to read ZipFile data from a URL and, via StringIO, parse a file inside the archive as CSV using pandas.read_csv.

r = req.get("http://seanlahman.com/files/database/lahman-csv_2014-02-14.zip").content
file = ZipFile(StringIO(r))
salaries_csv = file.open("Salaries.csv")
salaries = pd.read_csv(salaries_csv)

The last line gives me an error:

CParserError: Error tokenizing data. C error: Calling read(nbytes) on source failed. Try engine='python'.

However, if I try using

salaries = pd.read_csv(file.open("Salaries.csv"))

it works.

So I am wondering what I am missing here.

file.open should return a ZipExtFile object, and since read_csv only takes a string or a file handle / StringIO as input, why does the last line work?

Recommended answer

I think something is wrong with the way you read the data; it works for me using urllib2.

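from zipfile import ZipFile
from StringIO import StringIO  # Python 2 module
import urllib2                 # Python 2 module

import pandas as pd

# Download the whole archive into memory as a byte string
r = urllib2.urlopen("http://seanlahman.com/files/database/lahman-csv_2014-02-14.zip").read()
# Wrap the bytes in a file-like object so ZipFile can seek in it
file = ZipFile(StringIO(r))
# Open one member of the archive as a file-like object
salaries_csv = file.open("Salaries.csv")
salaries = pd.read_csv(salaries_csv)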
       yearID teamID lgID   playerID    salary
0        1985    BAL   AL  murraed02   1472819
1        1985    BAL   AL   lynnfr01   1090000
2        1985    BAL   AL  ripkeca01    800000
3        1985    BAL   AL   lacyle01    725000
4        1985    BAL   AL  flanami01    641667
5        1985    BAL   AL  boddimi01    625000
6        1985    BAL   AL  stewasa01    581250
7        1985    BAL   AL  martide01    560000
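
The code above targets Python 2, where the StringIO and urllib2 modules exist. As a rough sketch of the same steps on Python 3 (an adaptation, not part of the original answer): urllib2 became urllib.request, and because the downloaded body is bytes it has to be wrapped in io.BytesIO rather than StringIO. The helper names fetch_bytes, zip_from_bytes and read_csv_member are made up for illustration.

```python
import io
import urllib.request
import zipfile


def fetch_bytes(url):
    """Download a URL and return the response body as bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def zip_from_bytes(zip_bytes):
    """Open a zip archive that is held entirely in memory.

    The body is bytes, so it is wrapped in io.BytesIO;
    io.StringIO only accepts text in Python 3.
    """
    return zipfile.ZipFile(io.BytesIO(zip_bytes))


def read_csv_member(archive, member_name):
    """Parse one CSV member of an open archive into a DataFrame."""
    # pandas is imported here so the helpers above stay standard-library only.
    import pandas as pd

    # ZipFile.open returns a binary file-like object that read_csv can consume.
    with archive.open(member_name) as member:
        return pd.read_csv(member)
```

With these helpers, the steps from the answer would read roughly as archive = zip_from_bytes(fetch_bytes(url)) followed by salaries = read_csv_member(archive, "Salaries.csv"), where url is the address used above.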
