Python - read 1000 lines from a file at a time
Question
I've checked this, this and this.
The 3rd link seemed to have the answer yet it didn't do the job.
I can't have a solution where the whole file is brought into main memory, as the files I'll be working with will be very large. So I decided to use islice, as shown in the 3rd link. The first 2 links were irrelevant, as they used it for only 2 lines or read 1000 characters, whereas I need 1000 lines. For now N is 1000.
My file contains one million lines:
Sample:
1 1 1
1 2 1
1 3 1
1 4 1
1 5 1
1 6 1
1 7 1
1 8 1
1 9 1
1 10 1
So if I read 1000 lines at a time, I should go through the while loop 1000 times, yet when I print p to check how many times I've been through it, it doesn't stop at 1000. It reached 19038838 after running my program for 1400 seconds!!
Code:
from itertools import islice

def _parse(pathToFile, N, alg):
    p = 1
    with open(pathToFile) as f:
        while True:
            myList = []
            next_N_lines = islice(f, N)
            if not next_N_lines:
                break
            for line in next_N_lines:
                s = line.split()
                x, y, w = [int(v) for v in s]
                obj = CoresetPoint(x, y)
                Wobj = CoresetWeightedPoint(obj, w)
                myList.append(Wobj)
            a = CoresetPoints(myList)
            client.compressPoints(a)  # This line is not the problem
            print(p)
            p = p+1
    c = client.getTotalCoreset()
    return c
What am I doing wrong?
Recommended answer
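As @Ev.kounis said, your while loop doesn't seem to work properly. islice(f, N) returns an iterator object, and an iterator object is always truthy, even once the file is exhausted, so the check "if not next_N_lines" never triggers the break and the loop keeps running on empty chunks.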
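The truthiness issue can be checked in isolation. Below is a minimal sketch that uses an in-memory io.StringIO as a stand-in for the real file (the three sample lines and all variable names are assumptions for illustration only):

```python
import io
from itertools import islice

# Stand-in for a real file: three lines of "x y w" data (illustrative only).
f = io.StringIO("1 1 1\n1 2 1\n1 3 1\n")

first = islice(f, 1000)        # lazy iterator, nothing has been read yet
lines = list(first)            # consumes all three lines

exhausted = islice(f, 1000)    # the file is now at EOF
islice_is_truthy = bool(exhausted)   # the islice object itself is still truthy
remaining = list(exhausted)          # materializing it yields an empty list
list_is_truthy = bool(remaining)     # an empty list is falsy

print(islice_is_truthy, remaining, list_is_truthy)
```

Testing the materialized list rather than the islice object is what makes an end-of-file check possible.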
I would recommend using a generator function (one that uses yield) to get one chunk of data at a time, like this:
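def get_line():
    # Generator: yields the file one line at a time, so the whole
    # file is never held in memory.
    with open('your file') as file:
        for i in file:
            yield i

lines_required = 1000
gen = get_line()
# Take the next 1000 lines. Note that next(gen) raises StopIteration
# if fewer than 1000 lines remain.
chunk = [next(gen) for i in range(lines_required)]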
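To process the whole file rather than a single chunk, one possible approach is to keep islice but turn each slice into a list, so that an empty result can signal the end of the file. This is a sketch rather than the answerer's code: read_in_chunks is a hypothetical helper name, and the temporary 2500-line file exists only to make the example runnable.

```python
import os
import tempfile
from itertools import islice

def read_in_chunks(path, n=1000):
    """Yield lists of up to n lines from the file at path (hypothetical helper)."""
    with open(path) as f:
        while True:
            chunk = list(islice(f, n))  # materialize so emptiness can be tested
            if not chunk:               # an empty list means end of file
                break
            yield chunk

# Demo on a temporary file of 2500 lines in the question's "x y w" format.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.writelines(f"1 {i} 1\n" for i in range(1, 2501))

sizes = [len(chunk) for chunk in read_in_chunks(tmp.name)]
os.remove(tmp.name)
print(sizes)
```

Only one chunk of at most n lines is held in memory at a time, and the last chunk is simply shorter when the line count is not a multiple of n.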