Writing large Pandas Dataframes to CSV file in chunks

Views: 80 Published: 2020/4/29 3:22:54 Tags: python pandas dataframe export-to-csv large-data

Question

How do I write out a large data file to a CSV file in chunks?

I have a set of large data files (1M rows x 20 cols). However, only 5 or so columns of that data are of interest to me.

I want to make things easier by making copies of these files with only the columns of interest, so I have smaller files to work with for post-processing. So I plan to read each file into a dataframe, then write it out to a CSV file.

I've been looking into reading large data files into a dataframe in chunks. However, I haven't been able to find anything on how to write the data out to a CSV file in chunks.

Here is what I'm trying now, but it doesn't append to the CSV file:

with open(os.path.join(folder, filename), 'r') as src:
    df = pd.read_csv(src, sep='\t',skiprows=(0,1,2),header=(0), chunksize=1000)
    for chunk in df:
        chunk.to_csv(os.path.join(folder, new_folder,
                                  "new_file_" + filename), 
                                  columns = [['TIME','STUFF']])

Recommended answer

Solution:

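import os
import pandas as pd

out_path = os.path.join(folder, new_folder, "new_file_" + filename)
# Same chunked reader as in the question (folder, new_folder, filename come from there)
chunks = pd.read_csv(os.path.join(folder, filename), sep='\t',
                     skiprows=(0, 1, 2), header=0, chunksize=1000)

header = True
for chunk in chunks:
    # columns= takes a flat list of names (very old pandas versions used cols=)
    chunk.to_csv(out_path, header=header, columns=['TIME', 'STUFF'], mode='a')
    header = False  # only the first chunk writes the column header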

Notes:

mode='a' tells pandas to append to the file instead of overwriting it.
We only write a column header on the first chunk.
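
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. It is not part of the original answer: the helper name copy_columns_in_chunks, the file names, and the sample data are all made up for illustration. It also assumes two small variations: usecols is passed to read_csv so only the wanted columns are parsed, and the first chunk is written with mode='w' so that re-running the script does not append to a leftover file from an earlier run.

```python
import os
import tempfile

import pandas as pd


def copy_columns_in_chunks(src_path, dst_path, columns, chunksize=1000, **read_kwargs):
    """Copy the given columns from src_path to dst_path, one chunk at a time.

    Hypothetical helper for illustration; extra keyword arguments
    (for example sep) are passed through to pd.read_csv.
    """
    reader = pd.read_csv(src_path, usecols=columns, chunksize=chunksize, **read_kwargs)
    for i, chunk in enumerate(reader):
        first = i == 0
        chunk.to_csv(
            dst_path,
            columns=columns,             # flat list of column names, in output order
            header=first,                # write the header only on the first chunk
            mode="w" if first else "a",  # overwrite on the first chunk, append afterwards
            index=False,                 # assumption: the row index is not wanted in the copy
        )


# Small demonstration with made-up data in a temporary directory.
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, "data.tsv")
dst = os.path.join(tmp_dir, "new_file_data.csv")

pd.DataFrame({
    "TIME": range(2500),
    "STUFF": [x * 2 for x in range(2500)],
    "OTHER": ["x"] * 2500,
}).to_csv(src, sep="\t", index=False)

copy_columns_in_chunks(src, dst, ["TIME", "STUFF"], chunksize=1000, sep="\t")

# Read the copy back to check that all rows and only the two columns arrived.
result = pd.read_csv(dst)
```

If the index column is wanted in the output, drop index=False. The skiprows and header arguments from the question can be passed through the same way as sep.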
