How to resolve memory issue of pandas while reading big csv files
Question
I have a 100GB CSV file with millions of rows. I need to read, say, 10,000 rows at a time into a pandas DataFrame and write them to SQL Server in chunks.
I used chunksize as well as iterator as suggested on http://pandas-docs.github.io/pandas-docs-travis/io.html#iterating-through-files-chunk-by-chunk, and have gone through many similar questions, but I am still getting an out-of-memory error.
Can you suggest code to read very big CSV files into a pandas DataFrame iteratively?
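For context, a minimal sketch of the iterator-based pattern described in the linked documentation; the file name and chunk size here are placeholders:

import pandas as pd

# Create an iterator instead of loading the whole file into memory
reader = pd.read_csv('big_file.csv', iterator=True)
while True:
    try:
        # Pull the next 10,000 rows as a DataFrame
        chunk = reader.get_chunk(10000)
        # ... process/write the chunk here ...
    except StopIteration:
        break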
Answer
Demo:
import pandas as pd
# Read the CSV in chunks of 100,000 rows and append each chunk to the SQL table
for chunk in pd.read_csv(filename, chunksize=10**5):
    chunk.to_sql('table_name', conn, if_exists='append')
where conn is a SQLAlchemy engine (created by sqlalchemy.create_engine(...)).
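Putting it together, a minimal end-to-end sketch; the connection string, ODBC driver, file name, and table name below are assumptions and need to be adapted to your environment:

import pandas as pd
import sqlalchemy

# Hypothetical SQL Server connection string; adjust user, server, database, and driver
conn = sqlalchemy.create_engine(
    'mssql+pyodbc://user:password@server/database?driver=ODBC+Driver+17+for+SQL+Server'
)

# Stream the file in 10,000-row chunks so only one chunk is held in memory at a time
for chunk in pd.read_csv('big_file.csv', chunksize=10000):
    chunk.to_sql('table_name', conn, if_exists='append', index=False)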