DTD download error while parsing XHTML document in XOM


Question

I am trying to parse an HTML document with the doctype declared to use the transitional dtd as follows:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

When I call Builder.build on the document, I get the following exception:

  java.io.IOException: Server returned HTTP response code: 503 for URL: http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
       at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1305)
       at org.apache.xerces.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
       at org.apache.xerces.impl.XMLEntityManager.startEntity(Unknown Source)
       at org.apache.xerces.impl.XMLEntityManager.startDTDEntity(Unknown Source)
       at org.apache.xerces.impl.XMLDTDScannerImpl.setInputSource(Unknown Source)
       at org.apache.xerces.impl.XMLDocumentScannerImpl$DTDDispatcher.dispatch(Unknown Source)
       at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
       at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
       at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
       at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
       at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
       at nu.xom.Builder.build(Builder.java:1127)
       at nu.xom.Builder.build(Builder.java:1019)

If I remove the doctype declaration, it parses just fine. I can successfully download the DTD from my browser, which tells me that the URL is valid. I don't want to remove the doctype declaration. Is there a way to tell the builder not to download the DTD, or to provide it with an alternate DTD?

Recommended answer

Taking a quick look at the javadoc for Builder, I guess you could provide an EntityResolver via the constructor that takes an XMLReader. I would avoid letting the parser download files from the internet where possible.
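
To make that suggestion concrete, here is a minimal sketch using only the JDK's SAX classes. The class name OfflineDtdExample and the helpers newOfflineReader and rootElementName are made up for illustration. The resolver answers every external entity request with an empty stream, which is an assumption chosen for brevity, and not necessarily what the answerer had in mind. Passing the resulting reader to XOM is shown only as a comment, so the block runs without XOM on the classpath.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class OfflineDtdExample {

    // Hypothetical helper: an XMLReader whose EntityResolver never touches the network.
    static XMLReader newOfflineReader() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();

        // Assumption: answer every external entity (including the XHTML DTD)
        // with an empty document instead of fetching it from www.w3.org.
        EntityResolver resolver =
                (publicId, systemId) -> new InputSource(new StringReader(""));
        reader.setEntityResolver(resolver);
        return reader;
    }

    // Hypothetical helper: parses the XML text and returns the root element's local name.
    static String rootElementName(String xml) throws Exception {
        XMLReader reader = newOfflineReader();
        final String[] root = new String[1];
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (root[0] == null) {
                    root[0] = localName;
                }
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return root[0];
    }

    public static void main(String[] args) throws Exception {
        String xml =
            "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" "
            + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">"
            + "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
            + "<head><title>demo</title></head><body><p>hello</p></body></html>";
        System.out.println(rootElementName(xml));

        // With XOM on the classpath, the same reader would go to the Builder
        // constructor that takes an XMLReader:
        //   nu.xom.Document doc = new nu.xom.Builder(newOfflineReader()).build(file);
    }
}
```

One trade-off of the empty-stream approach: named entities that the XHTML DTD defines, such as &nbsp;, are no longer declared, so documents that use them will fail to parse. To supply an alternate DTD instead, the resolver can return an InputSource that reads a local copy of xhtml1-transitional.dtd (and the entity files it references) when the systemId matches.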
