Parsing Text Files

Question

I am new to Python and am looking to parse several text files (~5000) with data that looks like this:

random text ...
ID: ABC123456
random text ...
Title
Contained Text
End
random text ...

Each file has about 3000 lines, and I want to extract the ID and the text contained between the Title and End lines into a CSV file that looks something like this:

ID         Text
ABC123456  Contained Text 1
ABC123457  Contained Text 2

Any help would be appreciated!

Here is what I have so far:

f = open("test.txt",'r')
while True:
    text = f.readline()
    if 'Title' in text:
        print text

Recommended answer

Try something like this in place of your while loop:

record_id = None
in_title = False  # True only between the Title and End lines
with open("test.txt", "r") as f:
    for line in f:  # iterating the file stops at end of file
        text = line.strip()  # strip() removes the trailing newline
        if text.startswith("ID:"):
            record_id = text[3:].strip()
        elif text == "Title":
            in_title = True
        elif text == "End":
            in_title = False
        elif in_title and record_id is not None and text:
            print(record_id + " " + text)

This should print all your lines the way you want them (barring some formatting).
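The same idea can be packaged as a reusable function. The following is only a minimal sketch, assuming the layout from the question (an ID: line somewhere before each block, with Title and End alone on their own lines). The name extract_records and the choice to join a block's lines with spaces are illustrative assumptions, not part of the original answer.

```python
def extract_records(lines):
    """Yield (record_id, text) for every Title ... End block in lines.

    Assumes an 'ID: ...' line appears before each block and that
    'Title' and 'End' stand alone on their own lines.
    """
    record_id = None
    block = None  # list of collected lines while inside a block, else None
    for raw in lines:
        line = raw.strip()
        if line.startswith("ID:"):
            record_id = line[3:].strip()
        elif line == "Title":
            block = []
        elif line == "End" and block is not None:
            if record_id is not None:
                yield record_id, " ".join(block)
            block = None
        elif block is not None and line:
            block.append(line)


# Hypothetical sample mirroring the data in the question.
sample = [
    "random text ...\n",
    "ID: ABC123456\n",
    "random text ...\n",
    "Title\n",
    "Contained Text 1\n",
    "End\n",
    "random text ...\n",
]
records = list(extract_records(sample))
print(records)
```

Because the function accepts any iterable of lines, an open file object can be passed to it directly, which keeps the parsing logic separate from file handling when looping over many files.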

Writing these lines to another file comes down to replacing print(...) with other_file.write(...), where other_file is a handle to a different file opened for writing. Unlike print, write does not append a newline, so add "\n" yourself.
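Since the goal is a CSV file, Python's standard csv module may be a safer choice than raw write calls, because it quotes any commas or quotation marks in the text. The following is a rough sketch, assuming the rows are (ID, text) pairs that have already been extracted. The function name write_csv, the header names, and the file name output.csv are illustrative assumptions.

```python
import csv
import io


def write_csv(rows, out_file):
    """Write a header plus one (ID, text) row per record to an open file."""
    writer = csv.writer(out_file)
    writer.writerow(["ID", "Text"])
    writer.writerows(rows)


# Demonstrated on an in-memory buffer. With a real file, open it as
# open("output.csv", "w", newline="") so the csv module controls line endings.
buffer = io.StringIO()
write_csv(
    [("ABC123456", "Contained Text 1"), ("ABC123457", "Contained, Text 2")],
    buffer,
)
csv_text = buffer.getvalue()
print(csv_text)
```

Opening the output file once and appending rows as each input file is processed avoids holding the results of all ~5000 files in memory.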
