How to save a pandas dataframe in gzipped format directly?

Question

I have a pandas dataframe called df.

I want to save it in gzipped format. One way to do this is the following:

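import gzip
import shutil
import pandas

# df.save() was removed from later pandas releases; to_pickle() is its replacement
df.to_pickle('filename.pickle')
# Stream the uncompressed pickle into a gzip file
with open('filename.pickle', 'rb') as f_in, gzip.open('filename.pickle.gz', 'wb') as f_out:
    shutil.copyfileobj(f_in, f_out)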

However, this requires me to first create a file called filename.pickle. Is there a way to do this more directly, i.e., without creating filename.pickle?

When I want to load a dataframe that has been gzipped, I have to go through the same step of creating an uncompressed pickle file. For example, to read the file filename2.pickle.gz, which is a gzipped pandas dataframe, I know of the following method:

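import gzip
import shutil
import pandas

# Decompress into a plain (not gzipped) intermediate file
with gzip.open('filename2.pickle.gz', 'rb') as f_in, open('filename2.pickle', 'wb') as f_out:
    shutil.copyfileobj(f_in, f_out)

# pandas.load() was removed from later pandas releases; read_pickle() is its replacement
df2 = pandas.read_pickle('filename2.pickle')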

Can this be done without creating filename2.pickle first?

Recommended answer

We plan to add better serialization with compression eventually. Stay tuned to pandas development.
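For readers on a newer pandas, the intermediate file can be skipped. The sketch below is not part of the original answer: it assumes a pandas release in which to_pickle and read_pickle accept a compression argument, and it also shows a standard-library route that pickles straight into a gzip stream. The sample dataframe and the temporary directory are stand-ins for illustration only.

```python
import gzip
import os
import pickle
import tempfile

import pandas as pd

# Stand-in data; replace with your own dataframe
df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})

# Temporary directory used only so the example cleans up easily
out_dir = tempfile.mkdtemp()
path = os.path.join(out_dir, "filename.pickle.gz")

# Option 1: let pandas compress while pickling (assumes compression= is supported)
df.to_pickle(path, compression="gzip")
df2 = pd.read_pickle(path, compression="gzip")

# Option 2: standard library only; pickle directly into a gzip stream
path2 = os.path.join(out_dir, "filename2.pickle.gz")
with gzip.open(path2, "wb") as f_out:
    pickle.dump(df, f_out, protocol=pickle.HIGHEST_PROTOCOL)
with gzip.open(path2, "rb") as f_in:
    df3 = pickle.load(f_in)
```

In both options only the .gz file is written to disk; no uncompressed filename.pickle is created. As with any pickle, only load files from sources you trust.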
