Parsing an XML stream with no root element

Problem description

I need to parse a continuous stream of well-formed XML elements, for which I am given only an already constructed java.io.Reader object. These elements are not enclosed in a root element, nor are they preceded by an XML declaration such as <?xml version="1.0"?>, but are otherwise valid XML.

Using the Java org.xml.sax.XMLReader interface does not work, because an XMLReader expects a well-formed document with a single enclosing root element. So it reads the first element in the stream, treats it as the root, and fails at the next one with the typical error:


org.xml.sax.SAXParseException: The markup in the document following the root element must be well-formed.
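
The failure is easy to reproduce without any file. The following is a minimal sketch, assuming the JDK's default SAX parser; the class name `RootlessFailureDemo` and the method `isRejected` are hypothetical names introduced only for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class RootlessFailureDemo {
    /** Returns true if a plain SAX parse of the text fails with a SAXParseException. */
    static boolean isRejected(String xml) throws Exception {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return false; // parsed to the end without a fatal error
        } catch (SAXParseException e) {
            return true;  // DefaultHandler.fatalError rethrows the parser's exception
        }
    }

    public static void main(String[] args) throws Exception {
        // Two sibling elements with no enclosing root
        System.out.println(isRejected("<a/><b/>"));
        // The same elements wrapped in a root
        System.out.println(isRejected("<root><a/><b/></root>"));
    }
}
```

The first input should be rejected and the second accepted, which is why every workaround here supplies a root element in some way.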

For files that do not contain a root element, but for which one can be defined (called, say, MyRootElement), one can declare the file as an external entity and reference it from inside a small wrapper document, like this:

        import java.io.StringReader;
        import javax.xml.parsers.SAXParserFactory;
        import org.xml.sax.InputSource;
        import org.xml.sax.XMLReader;

        String path = <the full path to the file>;

        XMLReader xmlReader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();

        // Build a wrapper document that pulls the file in as an external entity
        StringBuilder buffer = new StringBuilder();

        buffer.append("<?xml version=\"1.0\"?>\n");
        buffer.append("<!DOCTYPE MyRootElement ");
        buffer.append("[<!ENTITY data SYSTEM \"file:///");
        buffer.append(path);
        buffer.append("\">]>\n");
        buffer.append("<MyRootElement xmlns:...>\n");
        buffer.append("&data;\n");   // expands to the content of the file
        buffer.append("</MyRootElement>\n");

        InputSource source = new InputSource(new StringReader(buffer.toString()));

        xmlReader.parse(source);

I tested the above by saving part of the java.io.Reader output to a file, and it works. However, this approach does not apply in my case: the extra information (XML declaration, root element) cannot be inserted, because the java.io.Reader object passed to my code is already constructed.

Essentially, I am looking for "fragmented XML parsing". So my question is: can it be done using standard Java APIs (including the org.xml.sax.* and javax.xml.* packages)?

Recommended answer

SequenceInputStream comes to the rescue: concatenate a dummy opening tag, the root-less content, and a dummy closing tag into a single stream, then parse that:

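    import java.io.ByteArrayInputStream;
    import java.io.FileInputStream;
    import java.io.InputStream;
    import java.io.SequenceInputStream;
    import java.nio.charset.StandardCharsets;
    import java.util.Arrays;
    import java.util.Collections;
    import javax.xml.parsers.SAXParser;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.helpers.DefaultHandler;

    SAXParserFactory saxFactory = SAXParserFactory.newInstance();
    SAXParser parser = saxFactory.newSAXParser();

    // "file" is the root-less XML file, defined elsewhere
    parser.parse(
        new SequenceInputStream(
            Collections.enumeration(Arrays.asList(
            new InputStream[] {
                new ByteArrayInputStream("<dummy>".getBytes(StandardCharsets.UTF_8)),
                new FileInputStream(file), // the root-less XML content
                new ByteArrayInputStream("</dummy>".getBytes(StandardCharsets.UTF_8)),
            }))
        ),
        new DefaultHandler()
    );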
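
The answer above works on byte streams, while the question starts from a java.io.Reader. The JDK has no character-stream counterpart to SequenceInputStream, so the following sketch applies the same wrapping idea with a small hand-written Reader. `ConcatReader`, `RootlessReaderDemo`, and `elementNames` are hypothetical names, not part of any standard API, and the sketch assumes the fragments carry no XML declaration or DOCTYPE of their own.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: reads the given Readers one after another. */
class ConcatReader extends Reader {
    private final Reader[] parts;
    private int current = 0;

    ConcatReader(Reader... parts) { this.parts = parts; }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        while (current < parts.length) {
            int n = parts[current].read(buf, off, len);
            if (n != -1) return n;   // data (or a zero-length read) from the current part
            current++;               // current part is exhausted, move to the next one
        }
        return -1;                   // all parts exhausted
    }

    @Override
    public void close() throws IOException {
        for (Reader r : parts) r.close();
    }
}

class RootlessReaderDemo {
    /** Parses root-less XML from a Reader and returns every element name seen, in order. */
    static List<String> elementNames(Reader fragments) throws Exception {
        List<String> names = new ArrayList<>();
        Reader wrapped = new ConcatReader(
            new StringReader("<dummy>"), fragments, new StringReader("</dummy>"));
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (!"dummy".equals(qName)) names.add(qName); // skip the artificial root
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(new InputSource(wrapped), handler);
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementNames(new StringReader("<a/><b>text</b>")));
    }
}
```

Because SAX delivers events as the input is consumed, the handler sees each fragment without waiting for the closing `</dummy>`, although the parser may buffer input internally, so events on a live stream can lag behind the data that has arrived.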
