In Python, what is the difference between f.readlines() and list(f)?
Question
Both the Python 2 tutorial and the Python 3 tutorial have a line about midway through section 7.2.1 that says:
If you want to read all the lines of a file in a list you can also use list(f) or f.readlines().
So my question is: what is the difference between these two ways of turning a file object into a list? I am curious about both the performance and the underlying Python object implementation (and perhaps any difference between Python 2 and Python 3).
Answer
Functionally, there is no difference; both methods result in the exact same list.
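A quick way to check the "same list" claim is to read one file both ways and compare. This is only a sketch: the temporary file and its three lines are made up for illustration.

```python
import os
import tempfile

# Write a small sample file; its name and contents are illustrative only.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\nthird\n")
    sample_path = tmp.name

# Read the file once through iteration ...
with open(sample_path) as f:
    via_list = list(f)

# ... and once through the dedicated method.
with open(sample_path) as f:
    via_readlines = f.readlines()

os.remove(sample_path)

# Both are lists of str, with the trailing newlines kept.
print(via_list == via_readlines)
```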
Implementation-wise, one uses the file object as an iterator (calling next(f) repeatedly until StopIteration is raised), while the other uses a dedicated method to read the whole file.
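The iterator route can be spelled out by hand. The helper below, lines_via_iteration, is a hypothetical name, and it only approximates what list(f) does internally; an io.StringIO object stands in for a real file.

```python
import io


def lines_via_iteration(f):
    """Collect lines by calling next(f) until StopIteration is raised."""
    lines = []
    while True:
        try:
            lines.append(next(f))
        except StopIteration:
            break
    return lines


# io.StringIO is used here as a stand-in for an open text file.
f = io.StringIO("a\nb\nc")
manual = lines_via_iteration(f)

f.seek(0)  # rewind so the dedicated method sees the same data
dedicated = f.readlines()

print(manual == dedicated)
```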
Python 2 and 3 differ in what that means, exactly, unless you use io.open() in Python 2. Python 2 file objects use a hidden buffer for file iteration, which can trip you up if you mix file object iteration and .readline() or .readlines() calls.
The io library (which handles all file I/O in Python 3) does not use such a hidden buffer; all buffering is instead handled by a BufferedIOBase() wrapper class. In fact, the io.IOBase.readlines() implementation uses the file object as an iterator under the hood anyway, and TextIOWrapper iteration delegates to TextIOWrapper.readline(), so list(f) and f.readlines() are essentially the same thing.
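Because there is no separate iteration buffer in the io library, mixing iteration with .readline() and .readlines() on Python 3 should simply continue from the current position. A small sketch, assuming Python 3 and a made-up four-line sample file:

```python
import io
import os
import tempfile

# Illustrative sample file; the contents are arbitrary.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("one\ntwo\nthree\nfour\n")
    sample_path = tmp.name

with open(sample_path) as f:
    # In Python 3 a text-mode file is a TextIOWrapper from the io library.
    is_text_wrapper = isinstance(f, io.TextIOWrapper)
    first = next(f)        # iteration
    second = f.readline()  # dedicated method continues after the iterated line
    rest = f.readlines()   # whatever is left

os.remove(sample_path)

print(is_text_wrapper, first, second, rest)
```

On Python 2 the same mixing on a built-in file object is what the hidden buffer can break, which is why io.open() is suggested there.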
Performance-wise, there isn't really a difference even in Python 2, because the bottleneck is file I/O: how quickly you can read the data from disk. At a micro level, performance can depend on other factors, such as whether the OS has already buffered the data and how long the lines are.
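If you want to measure this on your own machine, a rough timeit comparison might look like the following. The file size, line format, and repeat count are arbitrary choices, and the numbers will vary with OS caching, so treat the result as indicative only.

```python
import os
import tempfile
import timeit

# Build a throwaway file with 100,000 short lines (sizes are arbitrary).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.writelines("line %d\n" % i for i in range(100_000))
    sample_path = tmp.name


def read_with_list():
    with open(sample_path) as f:
        return list(f)


def read_with_readlines():
    with open(sample_path) as f:
        return f.readlines()


# Total seconds for 20 runs of each approach.
t_list = timeit.timeit(read_with_list, number=20)
t_readlines = timeit.timeit(read_with_readlines, number=20)

os.remove(sample_path)

print("list(f):       %.4f s" % t_list)
print("f.readlines(): %.4f s" % t_readlines)
```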