Use Python ElementTree to parse XML in an ASCII text file

Question

I have ASCII text files that contain XML sections in them. I try the following basic commands to open the file, but get an error:

import xml.etree.ElementTree as ET
tree = ET.parse('data_file.txt')

Is there a way I can still use Element Tree to be able to parse the XML sections out of the text file?

Recommended answer

You cannot use ElementTree to parse a file that isn't in its entirety well-formed XML. If there is text content before or after the root element of the XML document, XML parsing will fail, as it will if there are any other infractions against well-formedness.
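
To make the failure mode concrete, here is a minimal sketch that compares a well-formed fragment with the same fragment surrounded by plain text. The sample strings and the helper name try_parse are hypothetical, made up for illustration; only ET.fromstring and ET.ParseError come from the standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the same fragment, with and without surrounding text.
mixed = "log header line\n<record><id>1</id></record>\ntrailing notes\n"
well_formed = "<record><id>1</id></record>"


def try_parse(text):
    """Return the root tag if text parses as XML, otherwise None."""
    try:
        return ET.fromstring(text).tag
    except ET.ParseError:
        # Raised for any well-formedness violation, including stray text
        # before or after the root element.
        return None


print(try_parse(mixed))        # text around the root element: parsing fails
print(try_parse(well_formed))  # a single root element and nothing else
```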

More generally, standards-compliant XML parsers can parse only well-formed XML. So your scenario is actually fairly common.

One approach would be to write a program that processes the file and attempts to find the XML embedded in the other content, and that handles that part of the file with ElementTree. If your XML content is simple, this is quite feasible. If it's complex, or if there is more than one XML document embedded in the text file, it gets a little more challenging, but it may still be doable.
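
The approach described above might be sketched as follows. This is one possible way to do it, not a definitive implementation, and it rests on assumptions the question does not confirm: that each XML section has a known root tag name (here the hypothetical "record"), that sections do not contain nested elements with that same name, and that the root is not written as a self-closing tag. The function name extract_xml_sections and the sample text are invented for illustration.

```python
import re
import xml.etree.ElementTree as ET


def extract_xml_sections(text, root_tag):
    """Yield a parsed Element for each <root_tag>...</root_tag> span in text.

    Assumes root_tag is known in advance and is not nested inside itself.
    """
    pattern = re.compile(
        r"<{0}(?:\s[^>]*)?>.*?</{0}\s*>".format(re.escape(root_tag)),
        re.DOTALL,  # let '.' span line breaks inside a section
    )
    for match in pattern.finditer(text):
        try:
            yield ET.fromstring(match.group(0))
        except ET.ParseError:
            # The span looked like XML but is not well-formed; skip it.
            continue


# Hypothetical file content with two XML sections embedded in plain text.
sample = (
    "Report generated by a hypothetical tool\n"
    "<record id=\"1\"><name>alpha</name></record>\n"
    "some free text between sections\n"
    "<record id=\"2\">\n  <name>beta</name>\n</record>\n"
    "end of file\n"
)

for elem in extract_xml_sections(sample, "record"):
    print(elem.get("id"), elem.findtext("name"))
```

For a real file, the text could be read with open('data_file.txt', encoding='ascii').read() and passed to the function in place of sample. A regular expression is used here only to locate candidate spans; ElementTree still does the actual parsing, so malformed sections are rejected rather than silently accepted.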
