Filtering illegal XML characters in Java


Question

The XML spec defines the subset of Unicode characters that are allowed in XML documents: http://www.w3.org/TR/REC-xml/#charsets

How do I filter the characters that fall outside this subset out of a String in Java?

A simple test case:

  // U+0002 is a control character that XML 1.0 does not allow
  Assert.assertEquals("", filterIllegalXML("" + Character.valueOf((char) 2)));


Recommended answer

Finding all the characters that are invalid in XML is not trivial. You need to call, or reimplement, XMLChar.isInvalid() from Xerces:

http://kickjava.com/src/org/apache/xerces/util/XMLChar.java.htm
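
If depending on Xerces is not an option, the check can be reimplemented from the Char production in the XML 1.0 spec linked in the question. The following is a minimal sketch under that assumption, not the Xerces code. The class name XmlCharFilter and the helper isLegalXmlChar are hypothetical; filterIllegalXML is the name taken from the question's test case. It covers XML 1.0 only (XML 1.1 uses a different character range).

```java
// Hypothetical helper class; the names here are illustrative and not part of Xerces.
final class XmlCharFilter {

    private XmlCharFilter() {}

    // True if the code point matches the XML 1.0 "Char" production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isLegalXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns a copy of the input with every code point outside the
    // XML 1.0 Char range dropped. A null input is returned as null.
    static String filterIllegalXML(String input) {
        if (input == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(input.length());
        // Walk by code point so surrogate pairs (characters above U+FFFF)
        // are judged as one character; an unpaired surrogate is read as
        // its own value in D800-DFFF and is therefore dropped.
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            if (isLegalXmlChar(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

With this sketch, XmlCharFilter.filterIllegalXML("" + (char) 2) returns the empty string, as the question's test case expects. Iterating by code point rather than by char is a deliberate choice: a char-by-char loop would see each half of a valid surrogate pair as an illegal value and strip legal supplementary characters.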
