Remove a string and all lines before it from a file in Python
Problem description
I have a file with thousands of lines of data in it. I am reading in the file and editing it.
The following tag appears roughly 900 lines in, or later (it varies per file):
<Report name="test" xmlns:cm="http://www.domain.org/cm">
I need to remove that line and everything before it in several files, so I need code that searches for that tag and deletes it along with everything above it. It will not always be 900 lines down; the position varies. However, the tag will always be the same.
I already have the code to read in the lines and write to a file. I just need the logic for finding that line and removing it and everything before it.
I tried reading the file in line by line and then writing to a new file once it hits that string, but the logic is incorrect:
readFile = open(firstFile)
lines = readFile.readlines()
readFile.close()
w = open('test','w')
for item in lines:
    if (item == '<Report name="test" xmlns:cm="http://www.domain.org/cm">'):
        w.writelines(item)
w.close()
In addition, the exact string will not be the same in each file. The value "test" will be different. I perhaps need to check for the tag name ""
Any help will be much appreciated.
Recommended answer
You can use a flag like tag_found to check when lines should be written to the output. You initially set the flag to False, and then change it to True once you've found the right tag. The tag line itself is not written; every line read after the flag becomes True is copied to the output file.
TAG = '<Report name="test" xmlns:cm="http://www.domain.org/cm">'
tag_found = False
with open('tag_input.txt') as in_file:
    with open('tag_output.txt', 'w') as out_file:
        for line in in_file:
            if not tag_found:
                if line.strip() == TAG:
                    tag_found = True
            else:
                out_file.write(line)
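The question notes that the value of the name attribute ("test") differs between files, so an exact comparison against TAG would never match in those files. One possible adjustment, shown here only as a sketch, is to compare against the start of the line instead. The helper name lines_after_tag and the prefix '<Report name=' are assumptions made for illustration; they do not come from the original answer, and the prefix should be adapted to whatever part of the tag really is constant.

```python
def lines_after_tag(lines, tag_prefix='<Report name='):
    """Return the lines that follow the first line starting with tag_prefix.

    tag_prefix is an assumed constant part of the tag; adjust it as needed.
    """
    tag_found = False
    kept = []
    for line in lines:
        if tag_found:
            # Everything after the tag line is kept unchanged.
            kept.append(line)
        elif line.strip().startswith(tag_prefix):
            # The tag line itself is dropped, like everything before it.
            tag_found = True
    return kept


# Small in-memory example; a file object opened for reading works the same way.
sample = [
    'header line\n',
    '<Report name="abc" xmlns:cm="http://www.domain.org/cm">\n',
    'first data line\n',
    'second data line\n',
]
result = lines_after_tag(sample)
```

As with the flag-based loop above, nothing is kept if the tag never appears, so it may be worth checking for an empty result before overwriting a file.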
PS: The with open(filename) as in_file: syntax uses what Python calls a "context manager". The short explanation is that it automatically takes care of closing the file safely for you when the with: block is finished, so you don't have to remember to put in my_file.close() statements.
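To make the closing behaviour concrete, here is a small sketch that writes to a temporary file and checks the file object's closed attribute inside and after the with block. The file name demo.txt and the variable names are made up for this illustration. The second half shows a roughly equivalent manual version using try/finally.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')

with open(path, 'w') as out_file:
    out_file.write('hello\n')
    closed_inside = out_file.closed   # still open inside the block

closed_after = out_file.closed        # closed once the block has ended

# Roughly the same thing without a context manager.
manual_file = open(path)
try:
    content = manual_file.read()
finally:
    manual_file.close()               # runs even if read() raises
```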