Reading a part of a csv file

Question

I have a really large csv file, about 10 GB. Whenever I try to read it into an IPython notebook using

data = pd.read_csv("data.csv")  

my laptop gets stuck. Is it possible to read just 10,000 rows or 500 MB of a csv file?

Recommended answer

It is possible. You can create an iterator that yields your csv a fixed number of rows at a time, each chunk as a DataFrame, by passing iterator=True with your desired chunksize to read_csv.

df_iter = pd.read_csv('data.csv', chunksize=10000, iterator=True)

for iter_num, chunk in enumerate(df_iter, 1):
    print(f'Processing iteration {iter_num}')
    # do things with chunk

Or, more briefly:

for chunk in pd.read_csv('data.csv', chunksize=10000):
    # do things with chunk
    pass  # placeholder so the loop body is valid Python
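
As a rough sketch of what "do things with chunk" might look like, the example below keeps only a running total while iterating, so that no more than one chunk is held in memory at a time. The helper name column_sum_in_chunks, the column names, and the small in-memory sample standing in for data.csv are all assumptions made for illustration, not part of the original answer.

```python
import io

import pandas as pd


def column_sum_in_chunks(source, column, chunksize):
    """Sum one column of a csv by reading it `chunksize` rows at a time."""
    total = 0
    rows = 0
    # With chunksize set, read_csv returns an iterator of DataFrames.
    for chunk in pd.read_csv(source, chunksize=chunksize):
        total += chunk[column].sum()
        rows += len(chunk)
    return total, rows


# Small in-memory csv standing in for the large file (hypothetical data).
sample = io.StringIO(
    "id,value\n" + "\n".join(f"{i},{i * 10}" for i in range(100))
)

total, rows = column_sum_in_chunks(sample, "value", chunksize=30)
print(total, rows)
```

For a real file, a path such as 'data.csv' would be passed in place of the StringIO object. A smaller chunksize lowers peak memory use at the cost of more iterations.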

Alternatively, if there is just a specific part of the csv you want to read, you can use the skiprows and nrows options to start at a particular line and then read n rows, as the names suggest. For example, nrows=10000 on its own reads only the first 10,000 rows.
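
A minimal sketch of the skiprows and nrows approach follows. It assumes the file has a header on its first line that should be kept, so it skips a range of line numbers starting at 1 and leaves line 0 alone. The helper name read_rows and the in-memory sample data are hypothetical.

```python
import io

import pandas as pd


def read_rows(source, start, count):
    """Read `count` data rows starting at data row `start` (0-based)."""
    # skiprows takes file line numbers; line 0 is the header, so skipping
    # lines 1..start drops the first `start` data rows but keeps the header.
    return pd.read_csv(source, skiprows=range(1, start + 1), nrows=count)


# Small in-memory csv standing in for the large file (hypothetical data).
sample = io.StringIO(
    "id,value\n" + "\n".join(f"{i},{i * 10}" for i in range(100))
)

part = read_rows(sample, start=20, count=5)
print(part)
```

Passing an integer to skiprows instead of a range would skip that many lines from the top of the file, header included, so the range form is used here to preserve the column names.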
