Remove all hexadecimal characters before loading string into XML Document Object?

Question

I have an XML string that is posted to an ashx handler on the server. The XML string is built on the client side from a few different entries made on a form. Occasionally some users copy and paste content from other sources into the web form. When I try to load the XML string into an XmlDocument object using xmldoc.LoadXml(xmlStr), I get the following exception:

System.Xml.XmlException = {"'', hexadecimal value 0x0B, is an invalid character. Line 2, position 1."}

In debug mode I can see the rogue character (sorry, I'm not sure of its official name):

My question is: how can I sanitise the XML string before I attempt to load it into the XmlDocument object? Do I need a custom function to parse out all these sorts of characters one by one, or can I use some native .NET 4 class to remove them?

Recommended answer

Here is an example that removes invalid XML characters using Regex:

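 // Requires: using System.Text.RegularExpressions; using System.Xml;
 xmlString = CleanInvalidXmlChars(xmlString);
 XmlDocument xmlDoc = new XmlDocument();
 xmlDoc.LoadXml(xmlString);

 // Removes characters that XML 1.0 does not allow. Valid characters are
 // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF].
 public static string CleanInvalidXmlChars(string text)
 {
   // In .NET regex, \xNN takes exactly two hex digits and \uNNNN exactly four.
   // Strings are UTF-16, so #x10000-#x10FFFF appear as surrogate pairs:
   // keep well-formed pairs and remove only unpaired surrogates.
   string re = @"[\x00-\x08\x0B\x0C\x0E-\x1F\uFFFE\uFFFF]"
             + @"|[\uD800-\uDBFF](?![\uDC00-\uDFFF])"
             + @"|(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]";
   return Regex.Replace(text, re, "");
 }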
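
The question also asks whether a native .NET 4 class can do this without a hand-written pattern. One possible approach, offered as a minimal sketch rather than part of the original answer, is to filter the string with XmlConvert.IsXmlChar, which is available from .NET Framework 4.0. The class name XmlSanitizer and the method name RemoveInvalidXmlChars are hypothetical names chosen for illustration. The sketch assumes that well-formed surrogate pairs should be kept, since they encode code points in the #x10000-#x10FFFF range that XML 1.0 permits.

```csharp
using System;
using System.Text;
using System.Xml;

public static class XmlSanitizer
{
    // Hypothetical helper (not from the original answer): drops every
    // UTF-16 code unit that cannot appear in an XML 1.0 document.
    public static string RemoveInvalidXmlChars(string text)
    {
        var sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (XmlConvert.IsXmlChar(c))
            {
                // Valid single code unit: tab, LF, CR, or U+0020..U+D7FF / U+E000..U+FFFD.
                sb.Append(c);
            }
            else if (i + 1 < text.Length && char.IsSurrogatePair(c, text[i + 1]))
            {
                // Well-formed high/low surrogate pair: keep both halves.
                sb.Append(c).Append(text[i + 1]);
                i++;
            }
            // Anything else (for example 0x0B or an unpaired surrogate) is skipped.
        }
        return sb.ToString();
    }
}
```

With this helper, a call such as XmlSanitizer.RemoveInvalidXmlChars(xmlStr) before xmldoc.LoadXml(xmlStr) would strip the 0x0B character reported in the exception. Note that silently removing characters changes what the user submitted, so logging or rejecting such input may suit some applications better.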
