How can I get only the heading names from a text file?
Question
I have a text file as below:
Education:
askdjbnakjfbuisbrkjsbvxcnbvfiuregifuksbkvjb.iasgiufdsegiyvskjdfbsldfgd
Technical skills :
java,j2ee etc.,
work done:
oaugafiuadgkfjwgeuyrfvskjdfviysdvfhsdf,aviysdvwuyevfahjvshgcsvdfs,bvisdhvfhjsvjdfvshjdvhfjvxjhfvhjsdbvfkjsbdkfg
I would like to extract only the heading names, such as Education, Technical skills, etc.
The code is:
with open("aks.txt") as infile, open("fffm",'w') as outfile:
    copy = False
    for line in infile:
        if line.strip() == "Technical Skills":
            copy = True
        elif line.strip() == "Workdone":
            copy = True
        elif line.strip() == "Education":
            copy = False
        elif copy:
            outfile.write(line)
fh = open("fffm.txt", 'r')
contents = fh.read()
len(contents)
Recommended answer
If you are sure that the heading names occur before a colon (:), you can write a regex to search for that pattern.
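import re
with open("aks.txt") as infile:
    # With re.MULTILINE, ^ matches at the start of every line, including the first.
    # Capture everything before the first colon, dropping spaces or tabs before it.
    for m in re.finditer(r'^(.*?)[ \t]*:', infile.read(), re.MULTILINE):
        print(m.group(1))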
The output will look like:
Education
Technical skills
work done
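The regex approach treats any line that contains a colon as a heading. If body text may also contain colons, a stricter rule could help. The following is a minimal sketch, assuming a heading is a line whose last non-blank character is a colon; the function name `extract_headings` and the shortened sample text are made up for illustration and are not part of the original answer.

```python
def extract_headings(lines):
    """Return heading names, assuming a heading is a line that ends with ':'."""
    headings = []
    for line in lines:
        stripped = line.strip()
        # Require at least one character before the colon so a bare ":" is skipped.
        if stripped.endswith(":") and len(stripped) > 1:
            # Drop the colon and any whitespace that preceded it.
            headings.append(stripped[:-1].rstrip())
    return headings


# Shortened stand-in for the contents of aks.txt (hypothetical sample).
sample = """Education:
askdjbnakjfb.iasgiufd
Technical skills :
java,j2ee etc.,
work done:
oaugafiuadgk,aviysdv
"""

print(extract_headings(sample.splitlines()))
# ['Education', 'Technical skills', 'work done']
```

To run it on the real file, pass the open file object instead, for example `extract_headings(open("aks.txt"))`, since iterating over a file yields its lines.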