Read a zipped file as a pandas DataFrame


Question

I'm trying to unzip a csv file and pass it into pandas so I can work on the file.
The code I have tried so far is:

import requests, zipfile, StringIO
import pandas
r = requests.get('http://data.octo.dc.gov/feeds/crime_incidents/archive/crime_incidents_2013_CSV.zip')
z = zipfile.ZipFile(StringIO.StringIO(r.content))
crime2013 = pandas.read_csv(z.read('crime_incidents_2013_CSV.csv'))

After the last line, although Python is able to fetch the file, I get a "does not exist" at the end of the error.

Can someone tell me what I'm doing incorrectly?

Recommended answer

If you want to read a zipped or tar.gz file into a pandas DataFrame, the read_csv function already handles the decompression.

import pandas as pd
df = pd.read_csv('filename.zip')

Or the long form:

df = pd.read_csv('filename.zip', compression='zip', header=0, sep=',', quotechar='"')
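To try the two calls without downloading anything, here is a minimal sketch that writes a small ZIP archive into a temporary directory and reads it back. The names sample.zip and sample.csv and the two data rows are made up for illustration; they are not from the question's dataset.

```python
import os
import tempfile
import zipfile

import pandas as pd

# Hypothetical sample data standing in for a real CSV export.
csv_text = "offense,ward\nTHEFT,2\nROBBERY,5\n"

with tempfile.TemporaryDirectory() as tmp:
    zip_path = os.path.join(tmp, "sample.zip")

    # Create an archive that contains exactly one CSV file.
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.writestr("sample.csv", csv_text)

    # Short form: compression is inferred from the .zip extension.
    df_inferred = pd.read_csv(zip_path)

    # Long form: compression stated explicitly.
    df_explicit = pd.read_csv(zip_path, compression="zip", header=0, sep=",")

print(df_inferred)
```

Both calls should give the same DataFrame, since the short form relies on the default compression='infer'.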

Description of the compression parameter from the docs:

compression : {‘infer’, ‘gzip’, ‘bz2’, ‘zip’, ‘xz’, None}, default ‘infer’ For on-the-fly decompression of on-disk data. If ‘infer’ and filepath_or_buffer is path-like, then detect compression from the following extensions: ‘.gz’, ‘.bz2’, ‘.zip’, or ‘.xz’ (otherwise no decompression). If using ‘zip’, the ZIP file must contain only one data file to be read in. Set to None for no decompression.

New in version 0.18.1: support for ‘zip’ and ‘xz’ compression.
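The single-file restriction quoted above matters when the archive holds more than one file, or when the ZIP bytes are already in memory, as with r.content in the question. One possible approach, sketched here as an assumption rather than as the answer's own method, is to open the wanted member with zipfile and pass the resulting file object to read_csv. In the question's Python 2 code, z.read(...) returns the file's contents as a string, and read_csv most likely treats that string as a path, which would explain the "does not exist" error. The member names and data below are hypothetical, and zip_bytes stands in for the downloaded r.content.

```python
import io
import zipfile

import pandas as pd

# Build a small in-memory ZIP with two members (hypothetical names and data).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("crimes.csv", "offense,ward\nTHEFT,2\nROBBERY,5\n")
    zf.writestr("readme.txt", "not a CSV")

# Stand-in for r.content returned by requests.get(...).
zip_bytes = buffer.getvalue()

# Wrap the bytes in BytesIO so ZipFile can seek in them, then open the
# CSV member as a file-like object instead of reading it into a string.
with zipfile.ZipFile(io.BytesIO(zip_bytes)) as z:
    with z.open("crimes.csv") as member:
        crimes = pd.read_csv(member)

print(crimes)
```

The key difference from the question's code is z.open, which returns a file-like object that read_csv can consume, where z.read returns the raw contents.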
