Python: how to check for open and close tags


Question

I have an XML-like file, but it is not XML or HTML.

Sample file:

<config-file>
   name myconfig
   date 3-2-2016
</config-file>
  <client>
   <"ABC - CDE & 123">
   </"ABC - CDE & 123">
  </client>

We often edit this file and mess up the open or close tags: a tag is left unclosed, or a '<' or '>' is misplaced. I am trying to find a good way to parse the file to make sure every tag that is opened is also closed. I was thinking of:

1. Looping through each line, recording whether it starts with < + any characters + >, and making sure it has a matching closing </ + any characters + >. If it does not, report an error for that pattern.

Any help is welcome.

Recommended answer

You have the basics. You care about only three cases:

  1. Open tag
  2. Close tag
  3. Everything else (ignore)

Use regular expressions to find the open and close tags; make sure that the open-tag expression excludes a slash as the second character. Then make a simple stack: a list of strings will do. This list holds the tags that are currently open.
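
As an illustration of the regular-expression step, here is a minimal sketch. It assumes, based on the sample file, that each tag sits alone on its line; the pattern names and the classify helper are hypothetical, not part of the original answer.

```python
import re

# Assumption: a tag occupies a whole line, optionally indented.
# The open-tag pattern rejects a slash as the second character.
OPEN_TAG = re.compile(r'^\s*<([^/<>][^<>]*)>\s*$')
CLOSE_TAG = re.compile(r'^\s*</([^<>]+)>\s*$')

def classify(line):
    """Return ('open', name), ('close', name), or ('other', None)."""
    m = CLOSE_TAG.match(line)
    if m:
        return ('close', m.group(1))
    m = OPEN_TAG.match(line)
    if m:
        return ('open', m.group(1))
    # Everything else is ignored by the checker.
    return ('other', None)
```

A line with a misplaced bracket, such as one missing its '>', falls into the "other" case here, so a stricter checker might also want to flag any line that contains a stray '<' or '>'.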

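Operations:

  • Open tag: extract the tag name (strip the angle brackets) and push it onto the stack.
  • Close tag: extract the tag name (strip the angle brackets and the leading slash). Check whether it matches the tag on top of the stack. If so, pop it. If not, issue an error message. If the stack is empty, someone tried to close a tag without opening one; issue a message.
  • EOF: when the input runs out, check the stack. Any remaining strings are unclosed tags; issue a message.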

Note that this also gives you some options for recovery. You can scan the stack to see whether an invalid close tag matches something farther down; that indicates overlapping blocks. You can look for a near match, which suggests a misspelling. If you get a close tag with no possible open tag, you can issue a message and ignore it. These steps give you a chance to find multiple errors in one pass.
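
The recovery ideas above could be sketched roughly as follows. The diagnose_close helper and the 0.7 similarity cutoff are assumptions made for illustration; difflib.get_close_matches from the standard library stands in for the "near match" check.

```python
import difflib

def diagnose_close(tag, stack):
    """Explain why closing `tag` failed; the top of the stack is stack[-1]."""
    if not stack:
        return "no blocks are open"
    if tag in stack:
        # The tag was opened farther down, so the blocks above it
        # overlap with it or were never closed.
        last = len(stack) - stack[::-1].index(tag)
        skipped = stack[last:]
        return "overlapping blocks; still open inside: " + ", ".join(skipped)
    # Assumed cutoff: 0.7 similarity counts as a likely typo.
    near = difflib.get_close_matches(tag, stack, n=1, cutoff=0.7)
    if near:
        return "possible misspelling of " + near[0]
    return "no matching open tag; ignoring"
```

Calling this only when the close tag does not match the top of the stack keeps the main loop simple while still reporting more than one kind of mistake.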

Oh, what the heck ... I've done this enough times ...

stack = []  # open tags; the most recent one is at the end

with open("parse_test_1.txt", 'r') as parse_file:
    for line in parse_file:
        print("INPUT LINE:", line.rstrip())
        ltag = line.find('<')
        if ltag > -1:
            # Look for the closing bracket only after the opening one
            rtag = line.find('>', ltag)
            if rtag > -1:
                # Found left and right brackets: grab tag
                tag = line[ltag+1: rtag]
                if not tag.startswith('/'):
                    # Open tag: push it onto the stack
                    stack.append(tag)
                    print("TRACE open", stack)
                else:
                    # Close tag: drop the leading slash
                    tag = tag[1:]
                    if len(stack) == 0:
                        print("No blocks are open; tried to close", tag)
                    elif stack[-1] == tag:
                        # Close the block
                        stack.pop()
                        print("TRACE close", tag, stack)
                    else:
                        print("Tried to close", tag, "but most recent open block is", stack[-1])
                        if tag in stack:
                            # Recovery: treat the earlier block as closed
                            stack.remove(tag)
                            print("Prior block closed; continuing")

if stack:
    print("Blocks still open at EOF:", stack)
