How to change character encoding of XmlReader


Question

I have a simple XmlReader:

XmlReader r = XmlReader.Create(fileName);

while (r.Read())
{
    Console.WriteLine(r.Value);
}

The problem is that the XML file contains ISO-8859-9 characters, which makes XmlReader throw an "Invalid character in the given encoding." exception. I can solve this by adding a <?xml version="1.0" encoding="ISO-8859-9" ?> line at the beginning, but I'd like to solve it another way in case I can't modify the source file. How can I change the encoding of XmlReader?

Recommended answer

To force .NET to read the file as ISO-8859-9, use one of the many XmlReader.Create overloads, e.g. the one that takes a TextReader:

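// Needs: using System; using System.IO; using System.Text; using System.Xml;
// The StreamReader gets its own using block because XmlReader does not close
// its input by default (XmlReaderSettings.CloseInput is false).
using (var sr = new StreamReader(fileName, Encoding.GetEncoding("ISO-8859-9")))
using (XmlReader r = XmlReader.Create(sr))
{
    while (r.Read())
    {
        Console.WriteLine(r.Value);
    }
}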

However, that may not work because, IIRC, the W3C XML standard says that once the XML declaration has been read, a compliant parser should immediately switch to the encoding specified in the declaration, regardless of the encoding it was using before. In your case, if the XML file has no XML declaration, the encoding will be UTF-8 and it will still fail. I may be talking nonsense here, so try it and see. :-)
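One way to "try it and see" without touching a real file is the sketch below. It encodes a small declaration-less document as ISO-8859-9 bytes in memory, reads it back through a StreamReader-backed XmlReader, and compares the result. The class name Latin5XmlDemo, the helper ReadAllText, and the sample document are made up for illustration. It also assumes a .NET Core 3.0+ / .NET 5+ runtime, where ISO-8859-9 is only available after registering CodePagesEncodingProvider; on .NET Framework that line should not be needed (and would require the System.Text.Encoding.CodePages package).

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

static class Latin5XmlDemo
{
    // Hypothetical helper: concatenates the values of all text nodes,
    // decoding the stream with the encoding supplied by the caller.
    public static string ReadAllText(Stream input, Encoding encoding)
    {
        var sb = new StringBuilder();
        using (var sr = new StreamReader(input, encoding))
        using (XmlReader r = XmlReader.Create(sr))
        {
            while (r.Read())
            {
                if (r.NodeType == XmlNodeType.Text)
                {
                    sb.Append(r.Value);
                }
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // Assumption: running on .NET Core / .NET 5+, where legacy code pages
        // such as ISO-8859-9 must be registered before GetEncoding can find them.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding latin5 = Encoding.GetEncoding("ISO-8859-9");

        // Turkish letters (g-breve, u-umlaut, s-cedilla, dotless i, o-umlaut,
        // c-cedilla) written as escapes so the source file encoding does not matter.
        string expected = "\u011F\u00FC\u015F\u0131\u00F6\u00E7";

        // Sample document with no XML declaration, stored as ISO-8859-9 bytes.
        byte[] bytes = latin5.GetBytes("<root>" + expected + "</root>");

        string actual = ReadAllText(new MemoryStream(bytes), latin5);
        Console.WriteLine(actual == expected
            ? "Round trip OK"
            : "Mismatch: " + actual);
    }
}
```

As far as I know, an XmlReader created over a TextReader works with characters the TextReader has already decoded, so the UTF-8 default for declaration-less files should not come into play here. Treat that as something to confirm with the test above rather than a guarantee.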
