What is the difference between saving a pandas dataframe to pickle and to csv?

Question

I am learning Python pandas. I saw a tutorial that shows two ways to save a pandas dataframe:

  1. df.to_csv('sub.csv'), reopened with pd.read_csv('sub.csv')

  2. df.to_pickle('sub.pkl'), reopened with pd.read_pickle('sub.pkl')

The tutorial says to_pickle saves the dataframe to disk, and this confuses me. When I use to_csv, I do see a csv file appear in the folder, so I assume that is also saved to disk, right?

In general, why would we save a dataframe with to_pickle rather than saving it as csv, txt, or some other format?

Recommended answer

Pickle is a serialization format that can be used to store a pandas dataframe. You are basically writing the exact representation of your dataframe to disk, so when you load it back, the column types are the same and the index is the same. If you save the dataframe as a csv, you are just storing it as comma-separated text. Depending on your data set, some information may be lost when you load it back, for example column types that cannot be inferred from plain text.
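As a rough illustration of the point above, here is a minimal sketch that saves the same dataframe both ways and reloads it. The column names, the sample values, and the use of a temporary directory are all made up for the example; only the file names sub.csv and sub.pkl come from the question.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data: a datetime column, a categorical column,
# a float column, and a named index.
df = pd.DataFrame(
    {
        "when": pd.to_datetime(["2020-01-01", "2020-01-02"]),
        "grade": pd.Categorical(["a", "b"]),
        "score": [1.5, 2.5],
    },
    index=pd.Index([10, 20], name="id"),
)

# Write both files into a throwaway directory and read them straight back.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "sub.csv")
    pkl_path = os.path.join(tmp, "sub.pkl")

    df.to_csv(csv_path)
    df.to_pickle(pkl_path)

    from_csv = pd.read_csv(csv_path)
    from_pkl = pd.read_pickle(pkl_path)

# The pickle round trip should give back an equal dataframe.
print("pickle equals original:", from_pkl.equals(df))

# With default read_csv arguments, the index comes back as an ordinary
# column and the datetime/categorical types are not restored.
print(from_csv.dtypes)
```

Much of what the csv round trip loses can be recovered by hand, for instance by passing index_col, parse_dates, or dtype to pd.read_csv, but you then have to remember the right arguments yourself, whereas the pickle file carries that information with it.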

https://docs.python.org/3/library/pickle.html
