C# how to deserialize an xml tag embedded in text?

Question

I am trying to deserialize the output of .NET's XML doc comment using an XmlSerializer. For reference, the output of xml documentation looks like:

<?xml version="1.0"?>
<doc>
    <assembly>
        <name>Apt.Lib.Data.Product</name>
    </assembly>
    <members>
        <member name="P:MyNamespace.MyType.MyProperty">
            <summary>See <see cref="T:MyNamespace.MyOthertype"/> for more info</summary>
        </member>
        ...
    </members>
</doc>

The object I'm using to generate the serializer is:

    using System.Collections.Generic;
    using System.Xml.Serialization;

    // Maps the <doc> root element of the XML documentation file.
    [XmlRoot("doc")]
    public class XmlDocumentation
    {
        public static readonly XmlSerializer Serializer = new XmlSerializer(typeof(XmlDocumentation));

        [XmlElement("assembly")]
        public AssemblyName Assembly { get; set; }
        [XmlArray("members")]
        [XmlArrayItem("member")]
        public List<Member> Members { get; set; }

        public class AssemblyName
        {
            [XmlElement("name")]
            public string Name { get; set; }
        }

        public class Member
        {
            [XmlAttribute("name")]
            public string Name { get; set; }
            // Declared as a plain string, so nested markup inside <summary> fails to deserialize.
            [XmlElement("summary")]
            public string Summary { get; set; }
        }
    }

The problem is when the serializer encounters the embedded see cref tag. In that case the serializer throws the following exception:

System.InvalidOperationException : There is an error in XML document (147, 27). ----> System.Xml.XmlException : Unexpected node type Element. ReadElementString method can only be called on elements with simple or empty content. Line 147, position 27.

How can I capture the entire content of the summary tag as a string during deserialization?

Recommended answer

To the serializer, the embedded see cref tag is markup, not text: literal < and > characters cannot appear in the text content of an XML element. If you want the summary to be read as a plain string, sanitize the strings before they are serialized or deserialized.
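As a rough illustration of what sanitizing a value means here, the sketch below escapes the markup characters with the framework's `SecurityElement.Escape` method. The class and method names (`XmlEscapeSketch`, `EscapeForXml`) are hypothetical and only for illustration; the sketch assumes you have the raw summary text as a string before it is written into the XML.

```csharp
using System.Security;

public static class XmlEscapeSketch
{
    // Hypothetical helper: replaces <, >, ", ' and & with their XML entities
    // so the value can be stored as simple text content of an element.
    public static string EscapeForXml(string raw)
    {
        return SecurityElement.Escape(raw);
    }
}
```

For example, `XmlEscapeSketch.EscapeForXml("See <see cref=\"T:MyNamespace.MyOthertype\"/>")` returns `See &lt;see cref=&quot;T:MyNamespace.MyOthertype&quot;/&gt;`, which a string property can read back as simple content.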

You can do something like this if you need to be able to apply specific rules to how certain characters are escaped or substituted:

    // Requires: using System.Text;
    string ScrubString(string dirty)
    {
        StringBuilder strBldr = new StringBuilder(dirty.Length);

        foreach (char c in dirty)
        {
           if (IsXmlSafe(c))
           {
              strBldr.Append(c);
           }
           else
           {
              //do something to escape or replace that character. 
           }
        }
        return strBldr.ToString();
    }


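    // True if c is a single UTF-16 code unit allowed by the XML 1.0 Char production.
    // Code points U+10000..U+10FFFF are also legal, but they arrive as surrogate
    // pairs and cannot be tested one char at a time.
    bool IsXmlSafe(char c)
    {
       int charInt = Convert.ToInt32(c);

       return charInt == 9                            // tab
           || charInt == 10                           // line feed
           || charInt == 13                           // carriage return
           || (charInt >= 32    && charInt <= 55295)  // U+0020..U+D7FF
           || (charInt >= 57344 && charInt <= 65533); // U+E000..U+FFFD
    }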
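If hand-maintained ranges are not needed, the same per-character check can probably be delegated to the framework. The sketch below assumes .NET 4.0 or later, where `XmlConvert.IsXmlChar` and `XmlConvert.IsXmlSurrogatePair` are available; the class and method names (`XmlCharFilter`, `KeepXmlChars`) are hypothetical. Unlike the scrubber above, it simply drops anything that is not legal and keeps valid surrogate pairs intact.

```csharp
using System.Text;
using System.Xml;

public static class XmlCharFilter
{
    // Hypothetical helper: keeps only characters that XML 1.0 allows.
    public static string KeepXmlChars(string input)
    {
        var sb = new StringBuilder(input.Length);
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (XmlConvert.IsXmlChar(c))
            {
                sb.Append(c);
            }
            else if (i + 1 < input.Length
                     && XmlConvert.IsXmlSurrogatePair(input[i + 1], c))
            {
                // c is a high surrogate followed by a matching low surrogate:
                // keep both halves and skip past the pair.
                sb.Append(c).Append(input[i + 1]);
                i++;
            }
            // Anything else (control characters, lone surrogates) is dropped.
        }
        return sb.ToString();
    }
}
```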

You can also use some of the approaches here to just remove any illegal character using regex:

Invalid characters in XML
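A minimal sketch of the regex approach, not taken from the linked page: the pattern below matches any UTF-16 code unit outside the single-unit ranges of the XML 1.0 Char production and removes it. The names `XmlRegexCleaner` and `RemoveInvalidXmlChars` are hypothetical. As a simplification, it also strips surrogates, so characters above U+FFFF are lost.

```csharp
using System.Text.RegularExpressions;

public static class XmlRegexCleaner
{
    // Matches everything except tab, LF, CR, U+0020..U+D7FF and U+E000..U+FFFD.
    // Surrogate code units (U+D800..U+DFFF) are matched too, so supplementary
    // characters are removed by this simplified pattern.
    private static readonly Regex InvalidXmlChars = new Regex(
        @"[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD]",
        RegexOptions.Compiled);

    // Hypothetical helper: returns text with all matched characters deleted.
    public static string RemoveInvalidXmlChars(string text)
    {
        return InvalidXmlChars.Replace(text, string.Empty);
    }
}
```

Calling `XmlRegexCleaner.RemoveInvalidXmlChars("a\u0001b")` returns `ab`. Note that this only removes characters that are illegal anywhere in XML; it does not touch `<` or `>`, which are legal characters that merely need escaping in text content.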
