How can I split csv files in python?
Problem description
Because of a memory error, I have to split my csv files. I did some research and found this code from a Stack Overflow user, Aziz Alto:
csvfile = open('#', 'r').readlines()
filename = 1
for i in range(len(csvfile)):
    if i % 10000000 == 0:
        open(str(filename) + '.csv', 'w+').writelines(csvfile[i:i+10000000])
        filename += 1
It works well, but the code does not add the header to the second file, and the header is very important for me. How can I add the header to the second file?
Recommended answer
For every file from the second through the last, you always have to add the first line of your original file (the one containing the header):
# this loads the whole original file into memory
with open('#', 'r') as f:
    csvfile = f.readlines()

linesPerFile = 1000000
filename = 1

# this is better than your former loop: it steps through the lines 1000000 at
# a time, instead of incrementing 1000000 times and only writing on the
# millionth iteration
for i in range(0, len(csvfile), linesPerFile):
    with open(str(filename) + '.csv', 'w+') as f:
        if filename > 1:
            # this is the second or a later file, so write the header again
            f.write(csvfile[0])
        f.writelines(csvfile[i:i+linesPerFile])
    filename += 1
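The answer above still reads the whole original file with readlines(), so it may hit the same memory error on a very large input. As a possible alternative, here is a minimal sketch that streams the file line by line instead. The function name split_csv and its parameters (source_path, lines_per_file, output_dir) are made up for this illustration and are not part of the original answer. It assumes that every record sits on a single line (no quoted fields containing line breaks) and that the first line is the header. Unlike the answer above, it puts lines_per_file data rows in every output file and writes the header in addition to them.

```python
import itertools
import os


def split_csv(source_path, lines_per_file, output_dir="."):
    """Split a CSV file into 1.csv, 2.csv, ... and repeat the header in each.

    Hypothetical helper for illustration. Returns the list of paths written.
    """
    if lines_per_file < 1:
        raise ValueError("lines_per_file must be at least 1")

    written = []
    # newline="" keeps the original line endings untouched on read and write.
    with open(source_path, "r", newline="") as source:
        header = source.readline()
        file_number = 1
        while True:
            # Read one data line first so that no empty trailing file is created.
            first_line = source.readline()
            if not first_line:
                break
            out_path = os.path.join(output_dir, str(file_number) + ".csv")
            with open(out_path, "w", newline="") as out:
                out.write(header)
                out.write(first_line)
                # Stream the remaining lines of this chunk straight to disk.
                out.writelines(itertools.islice(source, lines_per_file - 1))
            written.append(out_path)
            file_number += 1
    return written
```

A call such as split_csv('input.csv', 1000000) would then write 1.csv, 2.csv and so on into the current directory, where 'input.csv' stands in for the real file name. Only one line is held in memory at a time, so the chunk size affects the size of the output files rather than memory use.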