How to ignore an inline DTD when parsing an XML file in Java

Question

I have a problem reading an XML file that contains an inline DTD declaration (the external declaration is already solved). I'm using the SAX approach (javax.xml.parsers.SAXParser). When there is no DTD definition, the callback sequence looks like, for example, StartElement-Characters-StartElement-Characters-EndElement-Characters... So the characters method is called immediately after each start or end element, which is how I need it to be. When a DTD is present in the file, the callback sequence changes to, for example, StartElement-StartElement-StartElement-Characters-EndElement-EndElement-EndElement. I need the characters method to be called after every element. Is there any way to prevent this change in the callback sequence?

My code:

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;

// Create a non-validating SAX parser.
SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setValidating(false);

SAXParser parser = factory.newSAXParser();
XMLReader reader = parser.getXMLReader();

// Switch off validation, DTD loading, and external entity handling.
reader.setFeature("http://xml.org/sax/features/validation", false);
reader.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
reader.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
reader.setFeature("http://xml.org/sax/features/external-general-entities", false);
reader.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
reader.setFeature("http://xml.org/sax/features/use-entity-resolver2", false);
reader.setFeature("http://apache.org/xml/features/validation/unparsed-entity-checking", false);
reader.setFeature("http://xml.org/sax/features/resolve-dtd-uris", false);
reader.setFeature("http://apache.org/xml/features/validation/dynamic", false);
reader.setFeature("http://apache.org/xml/features/validation/schema/augment-psvi", false);

// input is the InputSource (or system ID string) of the XML file.
reader.parse(input);

Here is the XML file that I'm trying to parse: link (it's a link to my Dropbox).

Recommended answer

I suspect that the nodes that were previously being reported to the characters() callback are now being reported to the ignorableWhitespace() callback. The simplest solution might be to call characters() from ignorableWhitespace().
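As a minimal sketch of that suggestion, the handler below forwards ignorableWhitespace() to characters(), so whitespace reaches the same code path whichever callback the parser chooses. The class name ForwardingHandlerSketch, the TextCollector handler, the collectText helper, and the sample document are all hypothetical names made up for illustration; a real handler would put its own logic in characters().

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; names are illustrative, not from the original question.
class ForwardingHandlerSketch {

    // Collects all character data, treating ignorable whitespace like ordinary text.
    static class TextCollector extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void ignorableWhitespace(char[] ch, int start, int length) {
            // Forward to characters() so whitespace is handled the same way
            // whether or not the parser applied the DTD's content models.
            characters(ch, start, length);
        }
    }

    // Parses an XML string without validation and returns everything seen by characters().
    static String collectText(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(false);
            TextCollector handler = new TextCollector();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return handler.text.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample document with an inline DTD and an element-only content model.
        String xml = "<!DOCTYPE root [\n"
                + "<!ELEMENT root (item)*>\n"
                + "<!ELEMENT item (#PCDATA)>\n"
                + "]>\n"
                + "<root>\n  <item>a</item>\n  <item>b</item>\n</root>";
        // Show newlines visibly so the forwarded whitespace is easy to spot.
        System.out.println(collectText(xml).replace("\n", "\\n"));
    }
}
```

With this in place, the whitespace between the item elements should arrive in characters() even when the parser classifies it as ignorable, which is the behaviour the question asks for.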

This is what the spec has to say about ignorableWhitespace():

"Validating Parsers must use this method to report each chunk of whitespace in element content (see the W3C XML 1.0 recommendation, section 2.10): non-validating parsers may also use this method if they are capable of parsing and using content models."

In other words, if there is a DTD and you are not validating, then it's up to the parser whether it reports whitespace in element-only content models using the characters() callback or the ignorableWhitespace() callback.
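To check which callback a particular parser actually uses, a small probe along the following lines may help. It is only a sketch: the WhitespaceProbe class, its Counter handler, and the count method are hypothetical names, and the split between the two counters depends on the parser implementation, so no specific split is assumed here.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical diagnostic class; names are illustrative only.
class WhitespaceProbe {

    // Counts whitespace characters per callback.
    static class Counter extends DefaultHandler {
        int viaCharacters; // whitespace characters delivered to characters()
        int viaIgnorable;  // characters delivered to ignorableWhitespace()

        @Override
        public void characters(char[] ch, int start, int length) {
            for (int i = start; i < start + length; i++) {
                if (Character.isWhitespace(ch[i])) {
                    viaCharacters++;
                }
            }
        }

        @Override
        public void ignorableWhitespace(char[] ch, int start, int length) {
            viaIgnorable += length;
        }
    }

    // Returns {whitespace seen by characters(), whitespace seen by ignorableWhitespace()}.
    static int[] count(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(false);
            Counter counter = new Counter();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), counter);
            return new int[] { counter.viaCharacters, counter.viaIgnorable };
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample document: root has an element-only content model.
        String xml = "<!DOCTYPE root [\n"
                + "<!ELEMENT root (item)*>\n"
                + "<!ELEMENT item (#PCDATA)>\n"
                + "]>\n"
                + "<root>\n  <item>a</item>\n  <item>b</item>\n</root>";
        int[] counts = count(xml);
        System.out.println("characters(): " + counts[0]
                + ", ignorableWhitespace(): " + counts[1]);
    }
}
```

If the second counter is non-zero for a document with an inline DTD, the parser is using the content models, and forwarding ignorableWhitespace() to characters() is the relevant fix.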
