How to only read lines in a text file after a certain string?
Problem description
I'd like to read into a dictionary all of the lines in a text file that come after a particular string. I'd like to do this over thousands of text files.
I'm able to identify and print out the particular string ('Abstract') using the following code (taken from this answer):
for files in filepath:
    with open(files, 'r') as f:
        for line in f:
            if 'Abstract' in line:
                print(line)
But how do I tell Python to start reading only the lines that come after the string?
Recommended answer
Just start another loop when you reach the line you want to start from:
for files in filepath:
    with open(files, 'r') as f:
        for line in f:
            if 'Abstract' in line:
                for line in f:  # now you are at the lines you want
                    pass  # do work
A file object is its own iterator, so once we reach the line containing 'Abstract', the inner loop continues from the next line and runs until the iterator is exhausted. The outer loop then finds nothing left to read and ends.
A simple example:
gen = (n for n in range(8))
for x in gen:
    if x == 3:
        print('Starting second loop')
        for x in gen:
            print('In second loop', x)
    else:
        print('In first loop', x)
Output:
In first loop 0
In first loop 1
In first loop 2
Starting second loop
In second loop 4
In second loop 5
In second loop 6
In second loop 7
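The same behaviour can be checked on a file-like object. This is a minimal sketch, assuming an in-memory `io.StringIO` stands in for a real file; the sample text and the `after_marker` name are made up for illustration:

```python
import io

# Hypothetical sample text; io.StringIO stands in for an opened file.
sample = io.StringIO("Title\nAuthors\nAbstract\nFirst line\nSecond line\n")

# Asking a file-like object for an iterator returns the object itself,
# so both loops below share one reading position.
assert iter(sample) is sample

after_marker = []
for line in sample:
    if 'Abstract' in line:
        # The inner loop resumes from the line after the marker.
        for line in sample:
            after_marker.append(line.rstrip('\n'))

print(after_marker)
```

Because both loops share one position, the marker line itself never reaches the inner loop.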
You can also use itertools.dropwhile to skip the lines up to the point you want:
from itertools import dropwhile

for files in filepath:
    with open(files, 'r') as f:
        # Drop every line until one contains 'Abstract'
        dropped = dropwhile(lambda _line: 'Abstract' not in _line, f)
        next(dropped, '')  # discard the 'Abstract' line itself
        for line in dropped:
            print(line)
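To get the lines into a dictionary, as the question asks, the dropwhile approach could be wrapped in a small helper. This is only a sketch: the function names `lines_after` and `collect`, and the choice to key the dictionary by file path, are assumptions for illustration and not part of the original answer.

```python
from itertools import dropwhile

def lines_after(path, marker='Abstract'):
    """Return the lines of `path` that follow the first line containing `marker`."""
    with open(path, 'r') as f:
        dropped = dropwhile(lambda line: marker not in line, f)
        next(dropped, None)  # skip the marker line itself, if there is one
        return [line.rstrip('\n') for line in dropped]

def collect(paths, marker='Abstract'):
    # Map each file path to the list of lines found after the marker.
    return {path: lines_after(path, marker) for path in paths}
```

A file that never contains the marker simply maps to an empty list, which may be convenient when running over thousands of files.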