python sax error "junk after document element"

Views: 32 | Published: 2021/7/15 18:31:58 | python sax


Question

I use Python SAX to parse an XML file. The file is actually a combination of multiple XML files. It looks like this:

<row name="abc" age="40" body="blalalala..." creationdate="03/10/10" />
<row name="bcd" age="50" body="blalalala..." creationdate="03/10/09" />

My Python code is below. It fails with a "junk after document element" error. Any good ideas for solving this problem? Thanks.

from xml.sax.handler import ContentHandler
from xml.sax import make_parser, SAXException


class PostHandler(ContentHandler):
    def __init__(self):
        super().__init__()
        self.find = 0
        self.buffer = ''
        self.mapping = {}

    def startElement(self, name, attrs):
        if name == 'row':
            self.find = 1
            self.buffer = ''  # start collecting text for this row
            self.body = attrs["body"]
            print(attrs["body"])

    # SAX calls characters(), not character(), for text content
    def characters(self, data):
        if self.find == 1:
            self.buffer += data

    def endElement(self, name):
        if self.find == 1:
            self.mapping[self.body] = self.buffer
            print(self.mapping)
            self.find = 0


parser = make_parser()
handler = PostHandler()
parser.setContentHandler(handler)
try:
    parser.parse(open("2.xml"))
except SAXException as err:
    # the posted snippet was cut off here; printing the error is an assumed completion
    print(err)

Recommended answer

xmldata = '''
<row name="abc" age="40" body="blalalala..." creationdate="03/10/10" />
<row name="bcd" age="50" body="blalalala..." creationdate="03/10/09" />
'''

Add a wrapper tag around the data. I've used ElementTree since it's so simple, but you could do the same with any parser:

from xml.etree import ElementTree as etree

# wrap the data in a single root element
xmldata = '<rows>' + xmldata + '</rows>'

rows = etree.fromstring(xmldata)
for row in rows:
    print(row.attrib)

Result

{'age': '40',
 'body': 'blalalala...',
 'creationdate': '03/10/10',
 'name': 'abc'}
{'age': '50',
 'body': 'blalalala...',
 'creationdate': '03/10/09',
 'name': 'bcd'}
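
Since the question uses SAX rather than ElementTree, the same wrapper idea can probably be applied there too. The following is only a minimal sketch, assuming Python 3 and a fragment with no XML declaration of its own. The names RowHandler and parse_rows are made up for illustration and are not from the original post.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class RowHandler(ContentHandler):
    """Hypothetical handler: collects the attributes of every <row> element."""

    def __init__(self):
        super().__init__()
        self.rows = []

    def startElement(self, name, attrs):
        if name == "row":
            # copy the SAX attributes into a plain dict
            self.rows.append(dict(attrs.items()))


def parse_rows(fragment):
    """Wrap a rootless XML fragment in <rows>...</rows> and parse it with SAX."""
    wrapped = "<rows>" + fragment + "</rows>"
    handler = RowHandler()
    # parseString is given bytes here; the fragment is assumed to be UTF-8 text
    xml.sax.parseString(wrapped.encode("utf-8"), handler)
    return handler.rows


if __name__ == "__main__":
    sample = '''
<row name="abc" age="40" body="blalalala..." creationdate="03/10/10" />
<row name="bcd" age="50" body="blalalala..." creationdate="03/10/09" />
'''
    for row in parse_rows(sample):
        print(row)
```

With the single <rows> root in place the parser sees one document element, so the second <row> no longer counts as junk after it.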
