Resolving which version of an XML Schema to use for XML documents with a version attribute

Question

I have to write some code to handle reading and validating XML documents that use a version attribute in their root element to declare a version number, like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?> 
<Junk xmlns="urn:com:initech:tps" 
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
    xsi:schemaLocation="urn:com:initech.tps:schemas/foo/Junk.xsd"
    VersionAttribute="2.0">

There are a bunch of nested schemas. My code has an org.w3c.dom.ls.LSResourceResolver to figure out which schema to use, implementing this method:

LSInput resolveResource(String type,
                        String namespaceURI,
                        String publicId,
                        String systemId,
                        String baseURI)

Previous versions of the schema embedded the schema version in the namespace, so I could use the namespaceURI and systemId to decide which schema to provide. Now the version number has moved to an attribute on the root element, and my resolver doesn't have access to that. How am I supposed to figure out the version of the XML document in the LSResourceResolver?

Recommended answer

I had never had to deal with schema versions before this and had no idea what was involved. When the version was part of the namespace then I could throw all the schemas in together and let them get sorted out, but with the version in the root element and namespace shared across versions there is no getting around reading the version information from the XML before starting the SAX parsing.

I'm going to do something very similar to what Pangea suggested (gets +1 from me), but I can't follow the advice exactly because the document is too big to read into memory in full, even once. By using StAX I can minimize the amount of work done to get the version from the file. See this DeveloperWorks article, "Screen XML documents efficiently with StAX":

The screening or classification of XML documents is a common problem, especially in XML middleware. Routing XML documents to specific processors may require analysis of both the document type and the document content. The problem here is obtaining the required information from the document with the least possible overhead. Traditional parsers such as DOM or SAX are not well suited to this task. DOM, for example, parses the whole document and constructs a complete document tree in memory before it returns control to the client. Even DOM parsers that employ deferred node expansion, and thus are able to parse a document partially, have high resource demands because the document tree must be at least partially constructed in memory. This is simply not acceptable for screening purposes.

The code to get the version information will look like:

import javax.xml.stream.XMLInputFactory
import javax.xml.stream.XMLStreamConstants
import javax.xml.stream.XMLStreamReader

// Reads only as far as the root element, then stops.
def map = [:]
def inputStream = new File(inputFile).newInputStream()
XMLStreamReader reader = null
try {
    reader = XMLInputFactory.newInstance().createXMLStreamReader(inputStream)
    while (reader.hasNext()) {
        if (reader.next() == XMLStreamConstants.START_ELEMENT) {
            // The first START_ELEMENT event is the root element.
            map.rootElementName = reader.localName
            for (int i = 0; i < reader.attributeCount; i++) {
                // Compare the local name; QName.toString() would include any namespace.
                if (reader.getAttributeLocalName(i) == 'VersionAttribute') {
                    map.versionIdentifier = reader.getAttributeValue(i)
                }
            }
            return map
        }
    }
    return map
} finally {
    reader?.close()       // releases parser resources, not the underlying stream
    inputStream.close()
}

Then I can use the version information to figure out which resolver to use and which schema documents to set on the SAXParserFactory.
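As a rough illustration of that last step, here is a minimal Java sketch, not the code actually used. It assumes the schemas for each version live in their own directory (schemaRoot/version/File.xsd); that layout, the class name VersionedSchemaResolver, and the helper names locate and factoryFor are all hypothetical. The resolver is constructed with the version read from the root element, so resolveResource no longer needs to discover it.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;

/** Hypothetical resolver that serves nested schemas from a per-version directory. */
class VersionedSchemaResolver implements LSResourceResolver {
    private final Path versionDir;
    private final DOMImplementationLS ls;

    VersionedSchemaResolver(Path schemaRoot, String version) throws Exception {
        // Assumed layout: <schemaRoot>/<version>/<file>.xsd
        this.versionDir = schemaRoot.resolve(version);
        this.ls = (DOMImplementationLS)
                DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
    }

    /** Maps a systemId such as "schemas/foo/Junk.xsd" to a file in the version directory. */
    Path locate(String systemId) {
        String fileName = systemId.substring(systemId.lastIndexOf('/') + 1);
        return versionDir.resolve(fileName);
    }

    @Override
    public LSInput resolveResource(String type, String namespaceURI, String publicId,
                                   String systemId, String baseURI) {
        if (systemId == null) {
            return null; // let the parser fall back to its default resolution
        }
        LSInput input = ls.createLSInput();
        input.setPublicId(publicId);
        // Point the parser at the version-specific copy of the requested schema.
        input.setSystemId(locate(systemId).toUri().toString());
        input.setBaseURI(baseURI);
        return input;
    }

    /** Builds a validating SAXParserFactory for the version found in the root element. */
    static SAXParserFactory factoryFor(Path schemaRoot, String version, String rootXsd)
            throws Exception {
        VersionedSchemaResolver resolver = new VersionedSchemaResolver(schemaRoot, version);
        SchemaFactory schemaFactory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        schemaFactory.setResourceResolver(resolver); // used for nested import/include
        Schema schema = schemaFactory.newSchema(
                new StreamSource(resolver.locate(rootXsd).toFile()));
        SAXParserFactory parserFactory = SAXParserFactory.newInstance();
        parserFactory.setNamespaceAware(true);
        parserFactory.setSchema(schema);
        return parserFactory;
    }
}
```

With the version from the StAX pass in hand, a call such as VersionedSchemaResolver.factoryFor(Paths.get("schemas"), "2.0", "Junk.xsd") would give a factory whose parsers validate against the 2.0 schemas; the large document is then parsed with SAX as before, so it is opened twice but never held in memory.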