Memory error due to the huge input file size
Problem description
When I use the following code to read a file:
lines=file("data.txt").read().split("\n")
I get the following error:
MemoryError
The file size is:
ls -l
-rw-r--r-- 1 charlie charlie 1258467201 Sep 26 12:57 data.txt
Obviously the file is too large to be read into memory all at once.
Why not just use:
with open("data.txt") as myfile:
    for line in myfile:
        do_something(line.rstrip("\n"))
or, if you're not on Python 2.6 or higher:
myfile = open("data.txt")
for line in myfile:
    do_something(line.rstrip("\n"))
In both cases, you'll get an iterator that can be treated much like a list of strings.
EDIT: Since your way of reading the entire file into one large string and then splitting it on newlines will remove the newlines in the process, I have added a .rstrip("\n") to my examples in order to better simulate the result.
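As a self-contained sketch of the line-by-line approach above: the sample data and the character-counting body standing in for do_something are made up here purely for illustration (the question's real data.txt is about 1.2 GB, which this pattern handles with constant memory).

```python
import os
import tempfile

# Write a tiny sample file so the sketch is runnable on its own;
# in the question this would be the large data.txt.
tmp = tempfile.NamedTemporaryFile(mode="w", suffix=".txt", delete=False)
tmp.write("alpha\nbeta\ngamma\n")
tmp.close()

line_count = 0
total_chars = 0

# Iterating over the file object yields one line at a time, so only
# a single line is held in memory, no matter how big the file is.
with open(tmp.name) as myfile:
    for line in myfile:
        stripped = line.rstrip("\n")  # drop the trailing newline, as in the answer
        line_count += 1
        total_chars += len(stripped)

os.unlink(tmp.name)

# line_count is now 3 and total_chars is 14 (5 + 4 + 5).
```

The same loop works unchanged whether the file is three lines or a billion, which is the point of preferring iteration over read().split("\n").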