How to extract information between two unique words in a large text file
Problem description
I have about 150 text files filled with character information. Each file contains two unique words, alpha and bravo, and I want to extract the text between these two words and write it to a different file.
I can do this manually by pressing CTRL+F for the two words and copying the text between them. I just want to know how to do it with a program (preferably Python) for many files.
You can use regular expressions for that.
>>> st = "alpha here is my text bravo"
>>> import re
>>> re.findall(r'alpha(.*?)bravo',st)
[' here is my text ']
My test.txt file
alpha here is my line
yipee
bravo
Now use open to read the file and then apply the regular expression. The re.DOTALL flag lets . match newlines, so the match can span several lines.
>>> f = open('test.txt','r')
>>> data = f.read()
>>> x = re.findall(r'alpha(.*?)bravo',data,re.DOTALL)
>>> x
[' here is my line\nyipee\n']
>>> "".join(x).replace('\n',' ')
' here is my line yipee '
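The question also asks about handling many files and writing each result to a different file. A minimal sketch of how the same regular expression could be applied to a whole folder is below. The function names extract_between and process_folder, the *.txt filter, and the choice to reuse each input file's name for the output are assumptions made for illustration, not part of the original answer. Files are read with the platform's default encoding, which may need adjusting.

```python
import re
from pathlib import Path

# Non-greedy match of everything between the two marker words;
# re.DOTALL lets "." match newlines so the text may span lines.
PATTERN = re.compile(r"alpha(.*?)bravo", re.DOTALL)


def extract_between(text):
    """Return the text between 'alpha' and 'bravo', or None if not found."""
    match = PATTERN.search(text)
    return match.group(1) if match else None


def process_folder(in_dir, out_dir):
    """Extract from every *.txt file in in_dir and write results to out_dir.

    Hypothetical helper: each output file reuses the input file's name.
    Returns the list of output paths that were written.
    """
    in_dir, out_dir = Path(in_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(in_dir.glob("*.txt")):
        extracted = extract_between(path.read_text())
        if extracted is None:
            continue  # one or both markers missing; skip this file
        target = out_dir / path.name
        target.write_text(extracted)
        written.append(target)
    return written
```

It could be called as process_folder("input_files", "extracted"), where both directory names are placeholders. Since each file is said to contain the markers only once, re.search is used here instead of re.findall; findall would work equally well if a file could hold several alpha ... bravo spans.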