Python: Most optimal way to read file line by line
Question
I have a large input file to read, so I don't want to use enumerate or fo.readlines(). A plain "for line in fo:" loop in the traditional way won't work, and I'll explain why, but I feel some modification of it is what I need right now. Consider the following file:
input_file.txt:
3 # No of tests that will follow
3 # No of points in current test
1 # 1st x-coordinate
2 # 2nd x-coordinate
3 # 3rd x-coordinate
2 # 1st y-coordinate
4 # 2nd y-coordinate
6 # 3rd y-coordinate
...
What I need is to read variable-sized chunks of lines, pair the coordinates into tuples, add the tuples to a list of cases, and then move on to reading the next case from the file.
I came up with this:
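with open(input_file) as f:
    T = int(next(f))            # number of test cases
    for _ in range(T):
        N = int(next(f))        # number of points in this case
        x, y = [], []           # fresh coordinate lists for each case
        for i in range(N):
            x.append(int(next(f)))
        for i in range(N):
            y.append(int(next(f)))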
Then I pair the two lists up into tuples. I feel there must be a cleaner way to do this. Any suggestions?
The y-coordinates need a separate for loop because the x and y coordinates of a point are N lines apart. So: read line i, read line (i + N), and repeat N times, for each case.
Recommended answer
This might not be the shortest possible solution but I believe it is "pretty optimal".
def parse_number(stream):
    return int(next(stream).partition('#')[0].strip())

def parse_coords(stream, count):
    return [parse_number(stream) for i in range(count)]

def parse_test(stream):
    count = parse_number(stream)
    return list(zip(parse_coords(stream, count), parse_coords(stream, count)))

def parse_file(stream):
    for i in range(parse_number(stream)):
        yield parse_test(stream)
It eagerly parses all coordinates of a single test, but because parse_file is a generator, each test is parsed lazily, only when you ask for it.
You can use it like this to iterate over the tests:
if __name__ == '__main__':
    with open('input.txt') as istr:
        for test in parse_file(istr):
            print(test)
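To try this without a file on disk, here is a small self-contained sketch that feeds the same functions from an io.StringIO. The SAMPLE text is made up for illustration (two tests, with 3 points and 1 point), and the names first and rest are only there for the demonstration.

```python
import io

def parse_number(stream):
    # Take the next line, drop any trailing "# comment", convert to int.
    return int(next(stream).partition('#')[0].strip())

def parse_coords(stream, count):
    return [parse_number(stream) for _ in range(count)]

def parse_test(stream):
    count = parse_number(stream)
    xs = parse_coords(stream, count)  # x-coordinates come first in the file
    ys = parse_coords(stream, count)  # y-coordinates follow, count lines later
    return list(zip(xs, ys))

def parse_file(stream):
    for _ in range(parse_number(stream)):
        yield parse_test(stream)

# Made-up input, not taken from the question's file.
SAMPLE = """\
2 # number of tests
3 # points in test 1
1
2
3
2
4
6
1 # points in test 2
5
7
"""

tests = parse_file(io.StringIO(SAMPLE))
first = next(tests)   # only the header and the first test are read here
rest = list(tests)    # the remaining tests are parsed on demand
print(first)          # [(1, 2), (2, 4), (3, 6)]
print(rest)           # [[(5, 7)]]
```

Nothing is read from the stream until next(tests) is called, so at most one test's coordinates are held in memory at a time when you loop over the generator.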
Better function names might help distinguish the eager functions from the lazy one. I'm experiencing a lack of naming creativity right now.
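As a possible variation (not part of the answer above), the same block-wise reading could be sketched with itertools.islice, which pulls exactly N lines from the file iterator per call. The helper names read_ints and iter_cases are hypothetical, and the sketch assumes the same layout as the question: a test count, then for each test a point count followed by N x-values and N y-values.

```python
import io
from itertools import islice

def read_ints(stream, count):
    # islice consumes exactly `count` lines and leaves the rest of the
    # stream untouched; anything after '#' on a line is ignored.
    return [int(line.partition('#')[0]) for line in islice(stream, count)]

def iter_cases(stream):
    (total,) = read_ints(stream, 1)       # number of test cases
    for _ in range(total):
        (n,) = read_ints(stream, 1)       # number of points in this case
        xs = read_ints(stream, n)
        ys = read_ints(stream, n)
        yield list(zip(xs, ys))

# Made-up input: one test with two points.
cases = list(iter_cases(io.StringIO("1\n2\n10\n20\n30\n40\n")))
print(cases)  # [[(10, 30), (20, 40)]]
```

If the file ends early, the single-element unpacking raises ValueError, which may or may not be the error handling you want.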