How to read well formed XML in Java, but skip the schema?

Question

I want to read an XML file that has a schema declaration in it.

And that's all I want to do, read it. I don't care if it's valid, but I want it to be well formed.

The problem is that the reader is trying to read the schema file, and failing.

I don't want it to even try.

I've tried disabling validation, but it still insists on trying to read the schema file.

Ideally, I'd like to do this with a stock Java 5 JDK.

Here's what I have so far, very simple:

    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    dbf.setValidating(false);
    DocumentBuilder db = dbf.newDocumentBuilder();
    Document doc = db.parse(file);

and here's the exception I am getting back:

    java.lang.RuntimeException: java.io.IOException: Server returned HTTP response code: 503 for URL: http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd

Yes, this HAPPENS to be an XHTML schema, but this isn't an "XHTML" issue, it's an XML issue. Just pointing that out so folks don't get distracted. And, in this case, the W3C is basically saying "don't ask for this thing, it's a silly idea", and I agree. But, again, it's a detail of the issue, not the root of it. I don't want to ask for it AT ALL.

Solution

The reference is not to a schema, but to a DTD.

A DTD can contain more than just structural rules. It can also declare entities. Many XML parsers, including the one bundled with the JDK, load and parse a referenced DTD even when validation is turned off, because it could declare entities that affect how the document is parsed and what its content is (you could have an entity for a single character or even for a whole phrase of text).
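To make that point concrete, here is a minimal sketch of a DTD entity changing the parsed content. The class name `EntityDemo`, the `note` element, and the `sig` entity are made up for illustration. The DTD is an internal subset, so nothing is fetched over the network.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class EntityDemo {

    // Hypothetical document: the internal DTD subset declares one entity, "sig"
    static final String XML =
            "<!DOCTYPE note [ <!ENTITY sig \"Regards, the XML team\"> ]>"
          + "<note>&sig;</note>";

    /** Parses XML and returns the text of the root element. */
    static String noteText() {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = db.parse(new InputSource(new StringReader(XML)));
            // The parser replaces &sig; with the text declared in the DTD
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(noteText());
    }
}
```

The text of `note` comes entirely from the DTD declaration, which is why a parser cannot assume a DTD is irrelevant just because validation is off.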

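If you want to avoid loading and parsing the referenced DTD, you can provide your own EntityResolver and test for the referenced DTD. The resolver can then return a local copy of the DTD file, return an empty source so that nothing is loaded, or return null to let the parser fall back to its default behaviour of opening the system identifier itself.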

Code sample from the referenced answer on custom EntityResolvers:

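    // builder is the DocumentBuilder that will parse the file
    builder.setEntityResolver(new EntityResolver() {
        @Override
        public InputSource resolveEntity(String publicId, String systemId)
                throws SAXException, IOException {
            if (systemId.contains("foo.dtd")) {
                // Hand the parser an empty DTD instead of fetching the real one
                return new InputSource(new StringReader(""));
            } else {
                // null tells the parser to resolve the entity the default way
                return null;
            }
        }
    });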
