How can I read large text files in Python, line by line, without loading it into memory?


Problem description

I need to read a large file, line by line. Let's say the file is more than 5 GB and I need to read each line, but obviously I do not want to use readlines() because it would build a very large list in memory.

How will the code below behave in this case? Does xreadlines itself read only one line at a time into memory? Is the generator expression needed at all?

f = (line for line in open("log.txt").xreadlines())  # how much is loaded in memory?

f.next()  
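For reference, the snippet above is Python 2 syntax: xreadlines() and file.next() no longer exist in Python 3, where a file object is already a lazy iterator over its lines. A minimal Python 3 equivalent, with no generator expression needed:

# Python 3: the file object is itself a lazy line iterator,
# so nothing beyond a small internal buffer is held in memory.
f = open("log.txt")
first_line = next(f)  # reads only one line from disk
f.close()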

Plus, what can I do to read the file in reverse order, just like the Linux tail command?

I found:

http://code.google.com/p/pytailer/

"python head、tail和向后按文本文件的行读取"

Both worked very well!
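For the curious, here is a minimal sketch of the usual technique behind such tools (this is not pytailer's actual code, and the function name and chunk size are illustrative choices): seek to the end of the file and scan backwards in fixed-size chunks, yielding complete lines as they are assembled.

import os

def read_lines_reversed(path, chunk_size=8192):
    """Yield the lines of a text file last-to-first by scanning
    fixed-size chunks backwards from the end of the file."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        position = f.tell()
        remainder = b""
        while position > 0:
            read_size = min(chunk_size, position)
            position -= read_size
            f.seek(position)
            chunk = f.read(read_size) + remainder
            lines = chunk.split(b"\n")
            # The first piece may be the tail of a line that continues
            # in the previous (earlier) chunk; hold it back for now.
            remainder = lines.pop(0)
            # (A file that ends with a newline yields one empty string first.)
            for line in reversed(lines):
                yield line.decode("utf-8")
        if remainder:
            yield remainder.decode("utf-8")

Only complete lines are ever decoded, so a multi-byte UTF-8 character split across a chunk boundary is reassembled via the remainder before decoding.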

Recommended answer

I provided this answer because Keith's, while succinct, doesn't close the file explicitly:

with open("log.txt") as infile:      # the file is closed automatically on exit
    for line in infile:              # the file object yields one line at a time
        do_something_with(line)
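Because the file object yields lines lazily, it also composes naturally with generator expressions for constant-memory processing. A small usage sketch (the "ERROR" marker is an invented example, not something from the original question):

# Count matching lines without ever materializing the file in memory;
# the search string "ERROR" is purely illustrative.
with open("log.txt") as infile:
    error_count = sum(1 for line in infile if "ERROR" in line)
print(error_count)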
