SAXReader does not re-escape characters

Problem description

I'm reading an XML file with dom4j. The file looks like this:

...
<Field>&#13;&#10; hello, world...</Field>
...

I read the file with SAXReader into a Document. When I call getText() on the node, I get the following String:

\r\n hello, world...

I do some processing and then write another file using asXml(). However, the characters are not escaped as they were in the original file, which causes an error in the external system that consumes the file.

How can I escape the special characters so that &#13;&#10; appears in the written file?

Solution

You can't do this easily. Those aren't 'escapes'; they are 'character entities' (numeric character references). They are a fundamental part of XML, and the parser resolves them before your code ever sees the text. Xerces has some very complex support for 'unparsed entities', but I doubt that it applies to these, as opposed to the kind that are defined in a DTD.
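
If the external system really does require &#13;&#10; in the output, one possible workaround (not something the answer above proposes) is to escape the text yourself when building the output string. The sketch below is only an illustration under stated assumptions: it uses the JDK's built-in DOM parser rather than dom4j so that it runs without extra libraries, and the class name CharRefEscaper and the helpers escapeText and readFieldText are hypothetical names invented for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class CharRefEscaper {

    // Hypothetical helper: escapes text content for XML and writes
    // CR and LF as numeric character references instead of raw characters.
    static String escapeText(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            switch (c) {
                case '&':  sb.append("&amp;"); break;
                case '<':  sb.append("&lt;");  break;
                case '>':  sb.append("&gt;");  break;
                case '\r': sb.append("&#13;"); break;
                case '\n': sb.append("&#10;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Hypothetical helper: parses a small XML string with the JDK parser
    // (standing in for SAXReader here) and returns the root element's text.
    static String readFieldText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Field>&#13;&#10; hello, world...</Field>";

        // The parser has already resolved the references: the text starts
        // with a real carriage return and line feed.
        String text = readFieldText(xml);

        // Re-escape by hand when producing the output element.
        String rewritten = "<Field>" + escapeText(text) + "</Field>";
        System.out.println(rewritten);
    }
}
```

This only covers text content that you serialize by hand. Whether dom4j's own writer can be made to emit these references is not addressed here, so with asXml() you would still need to post-process or replace the serialization step for the affected nodes.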
