Parsing XML with DOM, DOCTYPE gets erased


Problem description




Why does DOM in Java erase the DOCTYPE when editing XML?

I have this XML file:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE map[ <!ELEMENT map (station*) >
                <!ATTLIST station  id   ID    #REQUIRED> ]>
<favoris>
<station id="5">test1</station>
<station id="6">test1</station>
<station id="8">test1</station>
</favoris> 

My function is very basic:

public static void EditStationName(int id, InputStream is, String path, String name) throws ParserConfigurationException, SAXException, IOException, TransformerFactoryConfigurationError, TransformerException{
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

    DocumentBuilder builder = factory.newDocumentBuilder();
    Document dom = builder.parse(is);

    Element e = dom.getElementById(String.valueOf(id));
    e.setTextContent(name);
    // Write the DOM document to the file
    Transformer xformer = TransformerFactory.newInstance().newTransformer();
    FileOutputStream fos = new FileOutputStream(path);
    Result result = new StreamResult(fos);  
    Source source = new DOMSource(dom);


    xformer.setOutputProperty(OutputKeys.STANDALONE, "yes");

    xformer.transform(source, result);
}

It works, but the DOCTYPE gets erased! I get the whole document back without the DOCTYPE part, which matters to me because it is what lets me retrieve elements by id. How can I keep the DOCTYPE, and why is it erased? I tried many solutions, for example with OutputKeys or omImpl.createDocumentType, but none of them worked...

Thank you!

Solution

(This response is, in a way, only a supplement to @Grzegorz Szpetkowski's answer, explaining why it works.)

You lose the doctype definition because you use the Transformer class, which performs an XSL transformation. The XSLT tree model has no object or node for the DOCTYPE declaration or the document type definition. When a parser hands the document over to an XSLT processor, the doctype information is lost and therefore cannot be retained or duplicated. XSLT offers some control over the serialization of the output tree, including adding a <!DOCTYPE ... > declaration with a public or system identifier. The values of these identifiers need to be known beforehand and cannot be read from the input tree. Creating or retaining an embedded DTD or entity declarations is also not supported (although one workaround is to output it as text with disable-output-escaping="yes").
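To illustrate the serialization control mentioned above, here is a minimal sketch that asks the JAXP identity transformer to emit a DOCTYPE with a system identifier via OutputKeys.DOCTYPE_SYSTEM. The class name DoctypeSystemDemo, the sample document, and the identifier favoris.dtd are assumptions made for illustration; they do not come from the question. Note that this only yields a reference to an external DTD, and the internal subset from the input is still dropped.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class DoctypeSystemDemo {

    // Hypothetical sample modeled on the question's file (DOCTYPE name matched to the root element).
    static final String SAMPLE =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE favoris [ <!ELEMENT favoris (station*)>\n"
        + "  <!ELEMENT station (#PCDATA)>\n"
        + "  <!ATTLIST station id ID #REQUIRED> ]>\n"
        + "<favoris><station id=\"5\">test1</station></favoris>";

    // Parses the XML and serializes it through the identity transformer,
    // requesting a DOCTYPE declaration with the given system identifier.
    static String serialize(String xml, String systemId) throws Exception {
        Document dom = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));

        Transformer xformer = TransformerFactory.newInstance().newTransformer();
        // The identifier must be supplied by the caller; it is not read from the input tree.
        xformer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, systemId);

        StringWriter out = new StringWriter();
        xformer.transform(new DOMSource(dom), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // "favoris.dtd" is a placeholder system identifier.
        System.out.println(serialize(SAMPLE, "favoris.dtd"));
    }
}
```

With this approach the ID attribute declaration would have to live in the external DTD file for getElementById to keep working on the next parse.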

To preserve the DTD, you need to write your document out with an XML serializer instead of an XSL transformation, as Grzegorz already suggested.
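As a sketch of the serializer route, the example below uses the DOM Level 3 Load and Save API (DOMImplementationLS and LSSerializer), which ships with the JDK. Whether this is exactly the code in the referenced answer is an assumption. The class name DoctypePreservingEdit, the string-based method signature, and the sample document are hypothetical and chosen to keep the example self-contained; the sample is modeled on the question's file.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.InputSource;

public class DoctypePreservingEdit {

    // Hypothetical sample modeled on the question's file (DOCTYPE name matched to the root element).
    static final String SAMPLE =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE favoris [ <!ELEMENT favoris (station*)>\n"
        + "  <!ELEMENT station (#PCDATA)>\n"
        + "  <!ATTLIST station id ID #REQUIRED> ]>\n"
        + "<favoris>"
        + "<station id=\"5\">test1</station>"
        + "<station id=\"6\">test1</station>"
        + "</favoris>";

    // Renames the station with the given id and serializes the document
    // with an LSSerializer instead of a Transformer.
    static String editStationName(String xml, String id, String name) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document dom = builder.parse(new InputSource(new StringReader(xml)));

        // getElementById relies on the ID attribute declared in the DTD.
        Element e = dom.getElementById(id);
        if (e == null) {
            throw new IllegalArgumentException("No element with id " + id);
        }
        e.setTextContent(name);

        // Ask the DOM implementation for its Load and Save feature.
        Object feature = dom.getImplementation().getFeature("LS", "3.0");
        if (!(feature instanceof DOMImplementationLS)) {
            throw new IllegalStateException("DOM Load and Save is not supported by this parser");
        }
        LSSerializer serializer = ((DOMImplementationLS) feature).createLSSerializer();

        // Serializes the whole document node, including its DocumentType child.
        return serializer.writeToString(dom);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(editStationName(SAMPLE, "6", "renamed"));
    }
}
```

To write to a file as in the question, one could create an LSOutput with createLSOutput(), set its byte stream to a FileOutputStream, and call serializer.write(dom, output) instead of writeToString.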
