Filtering illegal XML characters in Java
Question
The XML spec defines the subset of Unicode characters that are allowed in XML documents: http://www.w3.org/TR/REC-xml/#charsets.
How do I filter the characters outside that subset out of a String in Java?
Simple test case:
Assert.assertEquals("", filterIllegalXML("" + Character.valueOf((char) 2)));
Recommended answer
It's not trivial to find all the invalid characters for XML. You need to call or reimplement XMLChar.isInvalid() from Xerces:
http://kickjava.com/src/org/apache/xerces/util/XMLChar.java.htm
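If reimplementing is the route taken, a minimal sketch could check each code point against the Char production of the XML 1.0 spec linked in the question (#x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD, #x10000-#x10FFFF). This is not the Xerces code: the class name XmlCharFilter and the helper isLegalXml10 are hypothetical, and filterIllegalXML is simply the name used in the question's test case. It assumes XML 1.0 rules and a non-null input.

```java
// Hypothetical helper, not part of Xerces or the JDK.
final class XmlCharFilter {
    private XmlCharFilter() {}

    // XML 1.0 Char production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isLegalXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns a copy of input with every code point outside the Char production dropped.
    static String filterIllegalXML(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            // Read a full code point so surrogate pairs are treated as one character.
            int cp = input.codePointAt(i);
            if (isLegalXml10(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

Iterating by code point rather than by char keeps valid supplementary characters (encoded as surrogate pairs) intact, while an unpaired surrogate is read as a code point in the D800-DFFF range and dropped.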