Java, XML, XSLT: Prevent DTD Validation


Question

I use the Java (6) XML API to apply an XSLT transformation to an HTML document from the web. This document is well-formed XHTML and therefore contains a valid DOCTYPE declaration (<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">). Now a problem occurs: upon transformation, the XSLT processor tries to download the DTD, and the W3C server rejects the request with an HTTP 503 error (due to bandwidth limiting by the W3C).

How can I prevent the XSLT processor from downloading the DTD? I don't need my input document to be validated.

The source is:

import java.io.StringReader;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

-

   String xslt = "<?xml version=\"1.0\"?>"+
   "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"+
   "    <xsl:output method=\"text\" />"+          
   "    <xsl:template match=\"//html/body//div[@id='bodyContent']/p[1]\"> "+
   "        <xsl:value-of select=\".\" />"+
   "     </xsl:template>"+
   "     <xsl:template match=\"text()\" />"+
   "</xsl:stylesheet>";

   try {
       Source xmlSource = new StreamSource("http://de.wikipedia.org/wiki/Right_Livelihood_Award");
       Source xsltSource = new StreamSource(new StringReader(xslt));
       TransformerFactory ft = TransformerFactory.newInstance();

       Transformer trans = ft.newTransformer(xsltSource);

       trans.transform(xmlSource, new StreamResult(System.out));
   }
   catch (Exception e) {
       e.printStackTrace();
   }

I read the following questions here on SO, but they all use a different XML API:

  • "DTD download error while parsing XHTML document in XOM"

Thanks!

Recommended Answer

I recently had this issue while unmarshalling XML using JAXB. The answer was to create a SAXSource from an XMLReader and an InputSource, then pass that to the unmarshal() method of the JAXB Unmarshaller. To avoid loading the external DTD, I set a custom EntityResolver on the XMLReader.

SAXParserFactory spf = SAXParserFactory.newInstance();
SAXParser sp = spf.newSAXParser();
XMLReader xmlr = sp.getXMLReader();
// Intercept external entity lookups so the parser never fetches the remote DTD.
xmlr.setEntityResolver(new EntityResolver() {
    public InputSource resolveEntity(String pid, String sid) throws SAXException {
        // Serve a local copy of the DTD when the parser asks for its system ID.
        if (sid.equals("your remote dtd url here"))
            return new InputSource(new StringReader("actual contents of remote dtd"));
        throw new SAXException("unable to resolve remote entity, sid = " + sid);
    }
});
// myInputSource is the InputSource wrapping the XML document to be parsed.
SAXSource ss = new SAXSource(xmlr, myInputSource);

As written, this custom entity resolver throws an exception if it is ever asked to resolve an entity other than the one you want it to resolve. If you would rather let the parser go ahead and load other remote entities itself, replace the throw statement with "return null;", which tells the parser to fall back to its default resolution. (Simply deleting the throw line would leave the method without a return value and it would no longer compile.)
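Applied to the XSLT case from the question, the same idea might look like the sketch below. It is not the answerer's code: the class name NoDtdTransform and the transform helper are hypothetical, and instead of serving a local copy of the DTD the resolver answers every external entity request with empty content. That is only safe when the document does not rely on entities declared in the DTD (for example &nbsp;); if it does, return a local copy of the DTD instead.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical helper class, named for illustration only.
public class NoDtdTransform {

    /** Applies the stylesheet to the document without fetching any external DTD. */
    public static String transform(InputSource document, String xslt) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true); // XSLT needs namespace-aware parsing
        XMLReader reader = spf.newSAXParser().getXMLReader();

        // Assumption: the document uses no entities defined in the DTD,
        // so every external entity (including the DTD) can be empty.
        reader.setEntityResolver(new EntityResolver() {
            @Override
            public InputSource resolveEntity(String publicId, String systemId) {
                return new InputSource(new StringReader(""));
            }
        });

        Transformer trans = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        // The transformer parses the input through our XMLReader.
        trans.transform(new SAXSource(reader, document), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xhtml = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" "
                + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">"
                + "<html><body><p>Hello</p></body></html>";
        String xslt = "<xsl:stylesheet version=\"1.0\" "
                + "xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
                + "<xsl:output method=\"text\"/>"
                + "<xsl:template match=\"/\"><xsl:value-of select=\"//p\"/></xsl:template>"
                + "</xsl:stylesheet>";
        System.out.println(transform(new InputSource(new StringReader(xhtml)), xslt));
    }
}
```

For the document in the question, the input could be wrapped as new InputSource("http://de.wikipedia.org/wiki/Right_Livelihood_Award"); the page itself is still fetched over HTTP, only the DTD lookup is short-circuited.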
