Java, XML, XSLT: Prevent DTD Validation

Question

I use the Java (6) XML API to apply an XSLT transformation to an HTML document from the web. The document is well-formed XHTML and so contains a valid DTD declaration (<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">). Now a problem occurs: upon transformation, the XSLT processor tries to download the DTD, and the W3C server denies this with an HTTP 503 error (due to bandwidth limitation by the W3C).

How can I prevent the XSLT processor from downloading the DTD? I don't need to validate my input document.

The source is:

import java.io.StringReader;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

--

   String xslt = "<?xml version=\"1.0\"?>"+
   "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"+
   "    <xsl:output method=\"text\" />"+
   "    <xsl:template match=\"//html/body//div[@id='bodyContent']/p[1]\"> "+
   "        <xsl:value-of select=\".\" />"+
   "     </xsl:template>"+
   "     <xsl:template match=\"text()\" />"+
   "</xsl:stylesheet>";

   try {
   Source xmlSource = new StreamSource("http://de.wikipedia.org/wiki/Right_Livelihood_Award");
   Source xsltSource = new StreamSource(new StringReader(xslt));
   TransformerFactory ft = TransformerFactory.newInstance();

   Transformer trans = ft.newTransformer(xsltSource);

   trans.transform(xmlSource, new StreamResult(System.out));
   }
   catch (Exception e) {
     e.printStackTrace();
   }

I read the following questions here on SO, but they all use another XML API:

Thanks!

Recommended answer

I recently had this issue while unmarshalling XML using JAXB. The answer was to create a SAXSource from an XMLReader and an InputSource, then pass that to the JAXB Unmarshaller's unmarshal() method. To avoid loading the external DTD, I set a custom EntityResolver on the XMLReader.

SAXParserFactory spf = SAXParserFactory.newInstance();
SAXParser sp = spf.newSAXParser();
XMLReader xmlr = sp.getXMLReader();
// Serve the DTD from a local string instead of fetching it over the network.
xmlr.setEntityResolver(new EntityResolver() {
    public InputSource resolveEntity(String pid, String sid) throws SAXException {
        if (sid.equals("your remote dtd url here"))
            return new InputSource(new StringReader("actual contents of remote dtd"));
        throw new SAXException("unable to resolve remote entity, sid = " + sid);
    }
});
// myInputSource is the InputSource for the XML document being parsed.
SAXSource ss = new SAXSource(xmlr, myInputSource);

As written, this custom entity resolver throws an exception if it is ever asked to resolve an entity other than the one you expect. If you would rather let the parser go ahead and load such remote entities itself, replace the "throw" line with "return null;", which tells the parser to fall back to its default resolution.
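
Applied to the Transformer from the question, the same idea might look like the sketch below. It is only an illustration under some assumptions: the class name NoDtdTransform and the method transform are hypothetical, and the resolver answers every external entity with an empty document, which is acceptable only when the DTD content is not needed. Named entities that are declared only in the DTD (for example &nbsp;) cannot be expanded this way.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical helper class, not part of the original question or answer.
public class NoDtdTransform {

    /** Applies the stylesheet to the input without fetching any external DTD. */
    public static String transform(InputSource input, String xslt) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true); // XSLT processing expects namespace-aware parsing
        XMLReader reader = spf.newSAXParser().getXMLReader();

        // Assumption: the DTD is not needed, so every external entity is
        // replaced by an empty document instead of a network request.
        reader.setEntityResolver(new EntityResolver() {
            public InputSource resolveEntity(String publicId, String systemId) {
                return new InputSource(new StringReader(""));
            }
        });

        Transformer trans = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        // The SAXSource makes the transformer parse with our configured reader.
        trans.transform(new SAXSource(reader, input), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Small inline document carrying the same DOCTYPE as in the question.
        String xhtml =
            "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" "
            + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">"
            + "<html><body><p>Hello</p></body></html>";
        String xslt =
            "<xsl:stylesheet version=\"1.0\" "
            + "xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"text\"/>"
            + "<xsl:template match=\"/\">"
            + "<xsl:value-of select=\"/html/body/p\"/>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";
        System.out.println(transform(new InputSource(new StringReader(xhtml)), xslt));
    }
}
```

To transform a remote page as in the question, pass new InputSource("http://de.wikipedia.org/wiki/Right_Livelihood_Award") as the first argument; the parser still downloads the page itself, but no longer requests the DTD from the W3C server.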
