Parse a list of XML fragments with no root element from a stream input
Question
Is it feasible in Java, using the SAX API, to parse a list of XML fragments with no root element from a stream input?
I tried parsing such XML but got
org.xml.sax.SAXParseException: The markup in the document following the root element must be well-formed.
before the endDocument event was even fired.
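The failure described above can be reproduced with a minimal sketch like the one below. The class name `RootlessDemo` and the helper `isRejected` are hypothetical names chosen for illustration; the exact exception message depends on the parser implementation and locale, so the sketch only checks that a `SAXParseException` is thrown.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class RootlessDemo {
    /** Returns true if the SAX parser rejects the input with a SAXParseException. */
    public static boolean isRejected(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(false); // same setting as in the question
        try {
            factory.newSAXParser().parse(
                    new ByteArrayInputStream(xml.getBytes("UTF-8")),
                    new DefaultHandler()); // DefaultHandler rethrows fatal errors
            return false;
        } catch (SAXParseException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        // Two sibling top-level elements: not a well-formed XML document.
        System.out.println(isRejected("<a/><b/>"));
        // A single root element parses without error.
        System.out.println(isRejected("<a/>"));
    }
}
```

Turning validation off does not help here, because the error is a well-formedness error rather than a validity error.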
I would rather not settle for obvious but clumsy solutions such as prepending a custom root element or parsing buffered fragments one at a time.
I am using the standard SAX API of Java 1.6. The SAX factory has setValidating(false), in case anyone wonders.
Recommended answer
First, and most important of all, the content you are parsing is not an XML document. From the XML Specification:
[Definition: There is exactly one element, called the root, or document element, no part of which appears in the content of any other element.]
Now, as to parsing this with SAX: in spite of what you said about clumsiness, I'd suggest the following approach:
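    import java.io.ByteArrayInputStream;
    import java.io.InputStream;
    import java.io.SequenceInputStream;
    import java.util.Arrays;
    import java.util.Collections;
    import java.util.Enumeration;

    // Wrap the rootless stream between a synthetic <root> start tag and end tag.
    Enumeration<InputStream> streams = Collections.enumeration(
        Arrays.asList(new InputStream[] {
            new ByteArrayInputStream("<root>".getBytes()),
            yourXmlLikeStream,
            new ByteArrayInputStream("</root>".getBytes()),
        }));
    SequenceInputStream seqStream = new SequenceInputStream(streams);
    // Now pass the `seqStream` into the SAX parser.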
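SequenceInputStream is a convenient way to concatenate multiple input streams into a single stream. The streams are read in the order in which they are passed to the constructor (or, in this case, returned by the Enumeration).
Pass it to your SAX parser, and you are done.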
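Put together, the approach might look like the following self-contained sketch. The class name `FragmentParser`, the method `topLevelNames`, and the depth-counting handler are illustrative assumptions, not part of the original answer; the handler simply records the name of every element that sits directly under the synthetic `<root>`.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class FragmentParser {
    private static final Charset UTF8 = Charset.forName("UTF-8");

    /** Returns the names of the top-level fragment elements, in document order. */
    public static List<String> topLevelNames(InputStream fragments) throws Exception {
        // Surround the rootless input with a synthetic root element.
        List<InputStream> parts = Arrays.<InputStream>asList(
                new ByteArrayInputStream("<root>".getBytes(UTF8)),
                fragments,
                new ByteArrayInputStream("</root>".getBytes(UTF8)));
        InputStream wrapped = new SequenceInputStream(Collections.enumeration(parts));

        final List<String> names = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            private int depth = 0; // 0 = outside <root>, 1 = directly inside it

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (depth == 1) {
                    names.add(qName); // a fragment's own root element
                }
                depth++;
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                depth--;
            }
        };

        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(wrapped, handler);
        return names;
    }

    public static void main(String[] args) throws Exception {
        InputStream in = new ByteArrayInputStream(
                "<a>1</a><b/><a>2</a>".getBytes(UTF8));
        System.out.println(topLevelNames(in)); // prints [a, b, a]
    }
}
```

One caveat with this sketch: it assumes the fragment stream does not begin with an XML declaration or DOCTYPE, since those are only allowed before the root element and would become ill-formed once `<root>` is placed in front of them. It also assumes the fragments are in an encoding compatible with the wrapper bytes (UTF-8 here).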