Lazy Method for Reading Big File in Python?

Question
I have a very big file (4 GB), and when I try to read it my computer hangs.
So I want to read it piece by piece: after processing each piece, store the processed piece into another file, then read the next piece.
Is there any method to yield these pieces?
I would love to have a lazy method.
Solution

To write a lazy function, just use yield:
def read_in_chunks(file_object, chunk_size=1024):
    """Lazy function (generator) to read a file piece by piece.
    Default chunk size: 1k."""
    while True:
        data = file_object.read(chunk_size)
        if not data:
            break
        yield data

f = open('really_big_file.dat')
for piece in read_in_chunks(f):
    process_data(piece)
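Since the question also asks to store each processed piece into another file, here is a minimal, self-contained sketch that combines the generator above with a hypothetical `process_data` step. The file names, the tiny stand-in input, and the uppercase transformation are all placeholder assumptions, not part of the original answer:

```python
import os

def read_in_chunks(file_object, chunk_size=1024):
    """Lazy generator: yield a file piece by piece."""
    while True:
        data = file_object.read(chunk_size)
        if not data:
            break
        yield data

def process_data(chunk):
    # Hypothetical processing step: uppercase each chunk.
    return chunk.upper()

# Build a small sample input so the sketch is self-contained.
with open('really_big_file.dat', 'wb') as f:
    f.write(b'abcdefgh' * 512)  # 4096 bytes stands in for the 4 GB file

# Read lazily, process each piece, write it to another file.
with open('really_big_file.dat', 'rb') as src, \
     open('processed_file.dat', 'wb') as dst:
    for piece in read_in_chunks(src, chunk_size=1024):
        dst.write(process_data(piece))

print(os.path.getsize('processed_file.dat'))  # 4096
```

Only one chunk is held in memory at a time, so this works the same way regardless of the input file's size.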
Another option would be to use iter and a helper function:
f = open('really_big_file.dat')
def read1k():
    return f.read(1024)

for piece in iter(read1k, ''):
    process_data(piece)
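A note on the sentinel: in Python 3, a file opened in binary mode returns b'' (not '') at end of file, so the sentinel passed to iter must match the mode. functools.partial can also replace the helper function. A sketch of that variant, using a tiny stand-in file (an assumption, not from the original answer):

```python
from functools import partial

# Small stand-in input so the example is self-contained.
with open('really_big_file.dat', 'wb') as f:
    f.write(b'x' * 3000)

pieces = []
with open('really_big_file.dat', 'rb') as f:
    # iter(callable, sentinel) calls f.read(1024) repeatedly
    # until it returns the sentinel b'' at end of file.
    for piece in iter(partial(f.read, 1024), b''):
        pieces.append(piece)

print([len(p) for p in pieces])  # [1024, 1024, 952]
```

The last piece is simply whatever remains, which is why it can be shorter than the chunk size.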
If the file is line-based, the file object is already a lazy generator of lines:
for line in open('really_big_file.dat'):
process_data(line)