Pythonic way of processing a file between two previously known strings
Problem description
I process log files with Python. Let's say I have a log file that contains a START line and an END line, like below:
START
one line
two line
...
n line
END
What I want is to store the content between the START and END lines for further processing.
I do the following in Python:
data = []                         # Tokenised lines collected between the delimiters
found_start = found_end = False   # Initialised so the final check cannot raise NameError

with open(file) as name_of_file:
    for line in name_of_file:
        if 'START' in line:  # We found the start delimiter
            print(line)
            found_start = True
            for line in name_of_file:  # We now read until the end delimiter
                if 'END' in line:  # We exit here as we have the info
                    found_end = True
                    break
                else:
                    if not line.isspace():  # Skip blank lines so no empty entries are stored
                        # Drop commas and surrounding whitespace, then split into tokens
                        data.append(line.replace(',', '').strip().split())

if found_start and found_end:
    relevant_data = data
I then process relevant_data.
This looks far too complicated for a language as clean as Python, hence my question: is there a more Pythonic way of doing this?
Thanks!
Recommended answer
To do that, you can use iter(callable, sentinel), discussed in this post. It keeps calling the callable until the sentinel value is returned, in your case 'END' (after applying .strip()).
with open(filename) as file:
    # Consume lines until the start token has been read
    start_token = next(l for l in file if l.strip() == 'START')
    # Read stripped lines until 'END', skipping empty ones
    result = [line.replace(',', '').split()
              for line in iter(lambda x=file: next(x).strip(), 'END') if line]
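For anyone who wants to try the idea without a log file on disk, here is a minimal, self-contained sketch of the same approach wrapped in a function. The name read_between, its start/end parameters, and the sample text are assumptions made for illustration and are not part of the original answer; io.StringIO stands in for the opened file.

```python
import io


def read_between(lines, start="START", end="END"):
    """Return tokenised, non-empty lines found between the start and end markers.

    Hypothetical helper for illustration; `lines` is any iterable of strings,
    such as an open text file.
    """
    lines = iter(lines)
    # Skip everything up to and including the start marker.
    for line in lines:
        if line.strip() == start:
            break
    else:
        return []  # No start marker was found.
    # iter(callable, sentinel) calls the lambda repeatedly and stops
    # as soon as it returns a value equal to `end`.
    stripped = iter(lambda: next(lines).strip(), end)
    # Drop commas, split into tokens, and skip lines that are empty after stripping.
    return [line.replace(",", "").split() for line in stripped if line]


sample = io.StringIO("header\nSTART\none, line\n\ntwo line\nEND\ntrailer\n")
print(read_between(sample))  # [['one', 'line'], ['two', 'line']]
```

With a real file, the same call would look like read_between(open_file_object) inside a with open(...) block. Returning an empty list when START is missing is a design choice of this sketch; the one-liner in the answer would raise StopIteration in that case.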