How do I split a huge text file in Python?

Question

I have a huge text file (~1GB) and sadly the text editor I use won't read such a large file. However, if I can just split it into two or three parts I'll be fine, so as an exercise I wanted to write a program in Python to do it.

What I think I want the program to do is find the size of the file, divide that number into parts, and for each part read up to that point in chunks, writing to a filename.nnn output file, then read up to the next line break and write that, then close the output file, and so on. Obviously the last output file just copies to the end of the input file.
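The plan described above can be sketched roughly as follows. This is only a minimal illustration under assumptions, not a definitive implementation: the function name `split_file`, its parameters, and the 1 MiB default chunk size are all hypothetical, and the file is opened in binary mode so that byte counts line up with the size reported by the filesystem.

```python
import os
import shutil


def split_file(path, parts, chunk_size=1024 * 1024):
    """Split `path` into `parts` files named path.000, path.001, ...

    Hypothetical helper: each part except the last is extended to the
    next line break, so no line is cut in half.
    """
    total = os.path.getsize(path)
    target = -(-total // parts)  # ceiling division: rough bytes per part
    names = []
    with open(path, "rb") as src:
        for index in range(parts):
            name = f"{path}.{index:03d}"
            with open(name, "wb") as dst:
                if index == parts - 1:
                    # Last part: copy everything that is left.
                    shutil.copyfileobj(src, dst, chunk_size)
                else:
                    remaining = target
                    while remaining > 0:
                        chunk = src.read(min(chunk_size, remaining))
                        if not chunk:  # reached end of file early
                            break
                        dst.write(chunk)
                        remaining -= len(chunk)
                    # Finish the line the last chunk stopped in
                    # (or take one more full line if it stopped on a break).
                    dst.write(src.readline())
            names.append(name)
    return names
```

With a small input, the trailing parts may come out empty, which a real program would probably want to handle.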

Can you help me with the key filesystem-related parts: file size, reading and writing in chunks, and reading to a line break?

I'll be writing this code test-first, so there's no need to give me a complete answer, unless it's a one-liner ;-)

Recommended answer

Check out os.stat() for the file size (the st_size field of its result) and file.readlines([sizehint]), which returns whole lines and stops once their total size passes roughly sizehint. Those two functions should be all you need for the reading part, and hopefully you know how to do the writing :)
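As a rough sketch of how those two calls might fit together (the name `split_with_readlines` and the `.nnn` naming are assumptions for illustration, not part of the answer):

```python
import os


def split_with_readlines(path, parts):
    """Hypothetical helper: write about `parts` line-aligned pieces of `path`."""
    size = os.stat(path).st_size
    hint = max(1, size // parts)  # a hint of 0 would mean "read everything"
    names = []
    with open(path, "rb") as src:
        index = 0
        while True:
            # Whole lines only; reading stops once the total passes the hint.
            lines = src.readlines(hint)
            if not lines:
                break
            name = f"{path}.{index:03d}"
            with open(name, "wb") as dst:
                dst.writelines(lines)
            names.append(name)
            index += 1
    return names
```

Two caveats with this approach: each piece is held in memory as a list of lines before it is written, and because every call can overshoot the hint, the number of output files is only approximately `parts`.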
