How to read and organize text files divided by keywords

Question

I'm working on Python code that reads a text file. The text file contains the information needed to construct a certain geometry, and it is divided into sections by keywords. For example, the file:

*VERTICES
1 0 0 0
2 10 0 0
3 10 10 0
4 0 10 0
*EDGES
1 1 2
2 1 4
3 2 3
4 3 4

contains the information for a square with vertices at (0,0), (0,10), (10,0), (10,10). The "*EDGES" section defines the connections between the vertices. The first number in each row is an ID number.

Here is my problem: the information in the text file is not necessarily in order. Sometimes the "*VERTICES" section appears first, and other times the "*EDGES" section comes first. I have other keywords as well, so I'm trying to avoid repeating if statements to test whether each line contains a new keyword.

What I have been doing is reading the text file multiple times, each time looking for a different keyword:

open file
read line by line
    if line == *VERTICES
        store all the following lines in a list until a new *command is encountered
close file
open file (again)
read line by line
    if line == *EDGES
        store all the following lines in a list until a new *command is encountered
close file
open file (again)
...

Can someone point out how I can identify these keywords without such a tedious procedure? Thanks.

Recommended answer

You can read the file once and store the contents in a dictionary. Since you have conveniently marked the "command" lines with a *, you can use every line beginning with a * as a dictionary key and all the lines that follow it as the values for that key. You can do this with a for loop:

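with open('geometry.txt') as f:
    x = {}
    key = None  # store the most recent "command" here
    for y in f:
        y = y.strip()  # drop the trailing newline and surrounding whitespace
        if not y:
            continue  # skip blank lines
        if y.startswith('*'):
            key = y[1:]  # your "command", without the leading '*'
            x[key] = []
        elif key is not None:  # ignore data that appears before any command
            x[key].append(y.split())  # add subsequent lines to the most recent key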
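
Once the dictionary is built, each section can be converted into whatever structure the geometry code needs. The following is a minimal sketch and not part of the original answer: the function names `parse_sections` and `build_geometry` are hypothetical, the sample text is inlined so the example runs on its own, and it assumes every vertex row is `id x y z` and every edge row is `id start end`.

```python
import io

# Sample input with the sections deliberately out of order.
SAMPLE = """*EDGES
1 1 2
2 1 4
3 2 3
4 3 4
*VERTICES
1 0 0 0
2 10 0 0
3 10 10 0
4 0 10 0
"""

def parse_sections(lines):
    """Group rows under the most recent '*KEYWORD' line in a single pass."""
    sections = {}
    key = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith('*'):
            key = line[1:].upper()  # normalise case: '*Edges' -> 'EDGES'
            sections.setdefault(key, [])
        elif key is not None:
            sections[key].append(line.split())
    return sections

def build_geometry(sections):
    """Convert string rows to typed data (row layouts are assumptions)."""
    vertices = {int(r[0]): tuple(float(v) for v in r[1:4])
                for r in sections.get('VERTICES', [])}
    edges = {int(r[0]): (int(r[1]), int(r[2]))
             for r in sections.get('EDGES', [])}
    return vertices, edges

sections = parse_sections(io.StringIO(SAMPLE))
vertices, edges = build_geometry(sections)
print(vertices[3])  # (10.0, 10.0, 0.0)
print(edges[2])     # (1, 4)
```

Because the lookup is by keyword, the order of the sections in the file does not matter, and supporting a new keyword only requires reading another key from the dictionary.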

Alternatively, Python's list and dictionary comprehensions can build the same dictionary from the file in one line:

with open('geometry.txt') as f:
    x = {y.split('\n')[0]:[z.split() for z in y.strip().split('\n')[1:]] for y in f.read().split('*')[1:]}

I'll admit this is not very nice looking, but it gets the job done: it splits the entire file into chunks at the '*' characters, then uses newlines and spaces as delimiters to break each chunk into a dictionary key and a list of lists (the dictionary value).
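
To make the one-liner easier to follow, here is the same logic unrolled into explicit steps. This is only an illustrative sketch: the variable names (`text`, `chunks`, `parsed`) are made up for the example, and a short inline string stands in for the file contents. Note that this approach assumes a '*' never appears inside the data rows.

```python
# A shortened stand-in for the contents returned by f.read().
text = "*VERTICES\n1 0 0 0\n2 10 0 0\n*EDGES\n1 1 2\n"

# 1. Split on '*'. The first piece is whatever precedes the first keyword
#    (an empty string here), so it is dropped with [1:].
chunks = text.split('*')[1:]

parsed = {}
for chunk in chunks:
    # 2. The first line of each chunk is the keyword; the rest are data rows.
    rows = chunk.strip().split('\n')
    keyword = rows[0]
    # 3. Split each data row on whitespace to get a list of fields.
    parsed[keyword] = [row.split() for row in rows[1:]]

print(parsed)
```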

Details about splitting, stripping, and slicing strings can be found in the Python documentation.
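
As a quick illustration of those three operations on lines like the ones in the geometry file (a small sketch; the variable names are arbitrary):

```python
line = "*VERTICES\n"

# Slicing: line[0] is the first character, line[1:] is everything after it.
marker = line[0]            # '*'

# Stripping: str.strip() removes leading and trailing whitespace,
# including the trailing newline.
keyword = line.strip()[1:]  # 'VERTICES'

# Splitting: str.split() with no argument splits on any run of whitespace
# and ignores leading/trailing whitespace.
row = "3 10 10 0\n"
fields = row.split()        # ['3', '10', '10', '0']

# Splitting on an explicit separator keeps empty pieces.
pieces = "*A\n*B".split('*')  # ['', 'A\n', 'B']

print(marker, keyword, fields, pieces)
```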
