Unescaping XML entities using XmlReader in .NET?


Problem description

I'm trying to unescape XML entities in a string in .NET (C#), but I don't seem to get it to work correctly.

For example, if I have the string AT&amp;T, it should be translated to AT&T.

One way is to use HttpUtility.HtmlDecode(), but that's for HTML.

So I have two questions about this:

  1. Is it safe to use HttpUtility.HtmlDecode() for decoding XML entities?

  2. How do I use XmlReader (or something similar) to do this? I have tried the following, but that always returns an empty string:

    static string ReplaceEscapes(string text)
    {
        StringReader reader = new StringReader(text);
    
        XmlReaderSettings settings = new XmlReaderSettings();
    
        settings.ConformanceLevel = ConformanceLevel.Fragment;
    
        using (XmlReader xmlReader = XmlReader.Create(reader, settings))
        {
            return xmlReader.ReadString();
        }
    }
    

Solution

Your #2 solution can work, but you need to call xmlReader.Read(); (or xmlReader.MoveToContent();) prior to ReadString.
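As a minimal sketch of that fix (not necessarily the exact code the answerer had in mind), the question's method could be adjusted as follows. The class name `XmlUnescape` and the `Program` wrapper are hypothetical and only there to make the example self-contained; the one substantive change is the `Read()` call, since a freshly created reader is positioned before the first node.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical wrapper class; only ReplaceEscapes comes from the question.
public static class XmlUnescape
{
    public static string ReplaceEscapes(string text)
    {
        var settings = new XmlReaderSettings
        {
            // Fragment allows input without a single root element.
            ConformanceLevel = ConformanceLevel.Fragment
        };

        using (var stringReader = new StringReader(text))
        using (var xmlReader = XmlReader.Create(stringReader, settings))
        {
            // A new reader has not read anything yet, so advance to the
            // first node before asking for its text.
            xmlReader.Read();
            return xmlReader.ReadString();
        }
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(XmlUnescape.ReplaceEscapes("AT&amp;T")); // prints AT&T
    }
}
```

Because the input is parsed as an XML fragment, any `<` in the string is treated as markup rather than literal text, so this approach fits strings that contain only character data and entity references.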

I guess #1 would also be acceptable, even though there are edge cases like &reg;, which is a valid HTML entity but not an XML entity. What should your unescaper do with it? Throw an exception, as a proper XML parser would, or just return "®", as the HTML parser would?
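The difference can be seen with a small experiment. This is a sketch under two assumptions: it uses `System.Net.WebUtility.HtmlDecode` as a stand-in for `HttpUtility.HtmlDecode` (so that no reference to System.Web is needed), and `TryXmlUnescape` is a hypothetical helper introduced only for this illustration.

```csharp
using System;
using System.IO;
using System.Net;
using System.Xml;

public static class EntityEdgeCase
{
    // Hypothetical helper: returns false when the XML parser rejects the input.
    public static bool TryXmlUnescape(string text, out string result)
    {
        var settings = new XmlReaderSettings
        {
            ConformanceLevel = ConformanceLevel.Fragment
        };

        try
        {
            using (var stringReader = new StringReader(text))
            using (var xmlReader = XmlReader.Create(stringReader, settings))
            {
                xmlReader.Read();
                result = xmlReader.ReadString();
                return true;
            }
        }
        catch (XmlException)
        {
            // An entity that XML does not define, such as &reg;, ends up here.
            result = null;
            return false;
        }
    }

    public static void Main()
    {
        const string input = "&reg;";

        // The HTML decoder knows the named HTML entities.
        Console.WriteLine("HTML decode: " + WebUtility.HtmlDecode(input));

        // The XML parser only knows the five predefined XML entities.
        string unescaped;
        bool ok = TryXmlUnescape(input, out unescaped);
        Console.WriteLine("XML parse succeeded: " + ok);
    }
}
```

Which behavior is preferable depends on where the strings come from: strict rejection suits data that is supposed to be XML, while the lenient HTML decoder suits text of mixed origin.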
