Writing XMLDocument unescaped in C#

Problem description

Currently I'm writing XHTML in an XmlDocument. This works perfectly, but I'm stuck on one problem. Some XmlText nodes can contain things like &nbsp;. When I write such nodes to a stream, the output uses the InnerXml value rather than the InnerText value. As a result the output is wrong: it contains &amp;nbsp; instead of &nbsp;. How can I use XmlWriter and XmlDocument without this escaping when writing to a stream? I just want unescaped output.

Recommended answer

You're almost certainly trying to solve the wrong problem here. If you want text with non-breaking spaces, then you should use the non-breaking space character. In a C# string literal you can write it as the escape sequence \u00A0, for example:

     using System.Xml;

     var xmldoc = new XmlDocument();
     XmlElement test = xmldoc.CreateElement("test");
     xmldoc.AppendChild(test);
     // U+00A0 is the non-breaking space character itself, not an entity.
     XmlText nbsp = xmldoc.CreateTextNode("\u00A0");
     test.AppendChild(nbsp);

HTML entities like nbsp are just a way to encode such characters in a non-Unicode text file. You shouldn't use them when constructing an XML DOM. By the way, if you force .NET to write the above DOM to an ASCII-encoded file (via the proper XmlWriterSettings), it will probably write the non-breaking space as a numeric character reference such as &#xA0;. In a UTF-8 encoded file (the default) it is written as the character itself, which looks like an ordinary space in most editors.
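To make the encoding point concrete, here is a minimal sketch that saves the same one-character DOM twice, once as ASCII and once as UTF-8. The helper name Save and the variable names are hypothetical and chosen only for this illustration; OmitXmlDeclaration is set purely to keep the output short.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Build the same DOM as above: <test> containing a single U+00A0 character.
var doc = new XmlDocument();
XmlElement root = doc.CreateElement("test");
doc.AppendChild(root);
root.AppendChild(doc.CreateTextNode("\u00A0"));

// Hypothetical helper: serialize the document with the given encoding
// and return the raw bytes that would end up in the file.
byte[] Save(XmlDocument d, Encoding encoding)
{
    var settings = new XmlWriterSettings { Encoding = encoding, OmitXmlDeclaration = true };
    using var stream = new MemoryStream();
    using (XmlWriter writer = XmlWriter.Create(stream, settings))
    {
        d.Save(writer);
    }
    return stream.ToArray();
}

byte[] asciiBytes = Save(doc, Encoding.ASCII);
byte[] utf8Bytes = Save(doc, new UTF8Encoding(false)); // false: no byte order mark

// ASCII cannot represent U+00A0, so the writer has to fall back to a character reference.
string asciiText = Encoding.ASCII.GetString(asciiBytes);
Console.WriteLine(asciiText);

// In UTF-8 the character is stored directly as the two bytes C2 A0.
Console.WriteLine(BitConverter.ToString(utf8Bytes));
```

In both cases the DOM holds the same single character; only the byte-level representation chosen by the writer differs.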

If you force certain literal character sequences to appear in the XML output, you risk creating invalid XML that conforming XML processors cannot load. For example, try to load <test>&nbsp;</test> into an empty XmlDocument. This will throw an exception, because nbsp is not one of the entities predefined by XML. To be fair, you can declare such entities, and the XHTML DTD does so. But I hope you see my point.
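The following sketch illustrates both halves of that paragraph: loading the undeclared entity fails, while the same content loads once the entity is declared. The internal DTD subset used here is an assumption made for the illustration (a one-line stand-in for what the XHTML DTD declares), and the variable names are hypothetical.

```csharp
using System;
using System.Xml;

// Without a declaration, &nbsp; is an undefined entity and loading fails.
bool undeclaredFailed = false;
try
{
    new XmlDocument().LoadXml("<test>&nbsp;</test>");
}
catch (XmlException ex)
{
    undeclaredFailed = true;
    Console.WriteLine("Load failed: " + ex.Message);
}

// Declaring the entity in an internal DTD subset makes the same content loadable.
var declared = new XmlDocument();
declared.LoadXml(
    "<!DOCTYPE test [<!ENTITY nbsp \"&#160;\">]>" +
    "<test>&nbsp;</test>");

// Inspect how the reference is represented in the loaded DOM.
var firstChild = declared.DocumentElement?.FirstChild;
Console.WriteLine(firstChild?.NodeType);
```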

Edit: XmlDocument is doing its job correctly. If it didn't escape characters such as & < >, you could create invalid XML that is impossible to load again. To force an XML entity in the output, use XmlDocument.CreateEntityReference. The bug is in whatever code puts entities in XmlText nodes instead of generating XmlEntityReference nodes.
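As a rough sketch of that suggestion, the snippet below mixes ordinary text nodes with an entity reference node. The element name p and the sample text are made up for the illustration.

```csharp
using System;
using System.Xml;

var entityDoc = new XmlDocument();
XmlElement p = entityDoc.CreateElement("p");
entityDoc.AppendChild(p);

// Plain text goes into XmlText nodes and is escaped when serialized.
p.AppendChild(entityDoc.CreateTextNode("a"));
// The entity goes into its own XmlEntityReference node and is written as a reference.
p.AppendChild(entityDoc.CreateEntityReference("nbsp"));
p.AppendChild(entityDoc.CreateTextNode("b & c"));

string xml = entityDoc.OuterXml;
Console.WriteLine(xml);
```

The ampersand in the text node is still escaped, while the entity reference is emitted unescaped. Note that a document written this way can only be loaded again by a parser that knows the nbsp entity, for example through a DOCTYPE that declares it.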
