Effective way to find any file's encoding


Question

Yes, this is a very frequently asked question, but the matter is still vague to me because I don't know much about it.

I would like a very precise way to find a file's encoding, as precise as Notepad++ is.

Thanks.

Recommended answer

The StreamReader.CurrentEncoding property rarely returns the correct text file encoding for me. I've had greater success determining a file's encoding, including its endianness, by analyzing its byte order mark (BOM):

using System.IO;
using System.Text;

/// <summary>
/// Determines a text file's encoding by analyzing its byte order mark (BOM).
/// Defaults to ASCII when no known BOM is found.
/// </summary>
/// <param name="filename">The text file to analyze.</param>
/// <returns>The detected encoding.</returns>
public static Encoding GetEncoding(string filename)
{
    // Read up to four bytes; 'read' is smaller when the file is shorter than that
    var bom = new byte[4];
    int read;
    using (var file = new FileStream(filename, FileMode.Open, FileAccess.Read))
    {
        read = file.Read(bom, 0, 4);
    }

    // Analyze the BOM
    if (bom[0] == 0x2b && bom[1] == 0x2f && bom[2] == 0x76) return Encoding.UTF7; // marked obsolete in .NET 5 and later
    if (bom[0] == 0xef && bom[1] == 0xbb && bom[2] == 0xbf) return Encoding.UTF8;
    // UTF-32LE must be checked before UTF-16LE because both BOMs start with FF FE
    if (read >= 4 && bom[0] == 0xff && bom[1] == 0xfe && bom[2] == 0 && bom[3] == 0) return Encoding.UTF32; //UTF-32LE
    if (bom[0] == 0xff && bom[1] == 0xfe) return Encoding.Unicode; //UTF-16LE
    if (bom[0] == 0xfe && bom[1] == 0xff) return Encoding.BigEndianUnicode; //UTF-16BE
    // Encoding.UTF32 is little-endian, so the big-endian BOM needs an explicit UTF32Encoding
    if (read >= 4 && bom[0] == 0 && bom[1] == 0 && bom[2] == 0xfe && bom[3] == 0xff) return new UTF32Encoding(true, true); //UTF-32BE
    return Encoding.ASCII;
}
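
A table-driven variant of the same BOM check might look like the sketch below. It is not part of the original answer: the class name BomSketch and the method DetectFromBom are hypothetical, and it assumes the five Unicode encodings listed are the only candidates of interest. Rather than hard-coding the byte values, it asks each candidate for its BOM through Encoding.GetPreamble() and tries the longest BOM first, so the UTF-32LE mark (FF FE 00 00) is not mistaken for UTF-16LE (FF FE).

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class BomSketch
{
    // Candidate encodings, each constructed so that GetPreamble() returns its BOM.
    private static readonly Encoding[] Candidates =
    {
        new UTF8Encoding(true),           // emits EF BB BF
        new UTF32Encoding(false, true),   // little-endian, FF FE 00 00
        new UTF32Encoding(true, true),    // big-endian,    00 00 FE FF
        new UnicodeEncoding(false, true), // UTF-16LE,      FF FE
        new UnicodeEncoding(true, true),  // UTF-16BE,      FE FF
    };

    // Returns the candidate whose BOM is a prefix of 'head',
    // or 'fallback' when no candidate matches.
    public static Encoding DetectFromBom(byte[] head, Encoding fallback)
    {
        return Candidates
            .OrderByDescending(e => e.GetPreamble().Length) // longest BOM first
            .FirstOrDefault(e =>
                head.Take(e.GetPreamble().Length).SequenceEqual(e.GetPreamble()))
            ?? fallback;
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            // File.WriteAllText writes the encoding's BOM before the text.
            File.WriteAllText(path, "hello", Encoding.UTF32);
            byte[] head = File.ReadAllBytes(path).Take(4).ToArray();
            Console.WriteLine(DetectFromBom(head, Encoding.ASCII).WebName);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Like any BOM-based check, this only identifies files that actually start with a BOM; everything else receives the fallback encoding.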

As a side note, you may want to change the last line of this method to return Encoding.Default instead, so that files without a recognizable BOM fall back to the OS's current ANSI code page. (That is the .NET Framework behavior; on .NET Core and later, Encoding.Default is UTF-8.)
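
The same fallback idea applies if StreamReader does the BOM check itself. The sketch below is an assumption-labeled illustration, not the answer's method: ReaderSketch and DetectWithReader are hypothetical names. It passes the fallback encoding to the StreamReader constructor with BOM detection enabled, then forces a read with Peek() before querying CurrentEncoding, because the property only reflects a detected BOM after the reader has read from the stream.

```csharp
using System;
using System.IO;
using System.Text;

public static class ReaderSketch
{
    // Returns the encoding StreamReader settles on after inspecting the BOM,
    // or 'fallback' when the file does not start with a BOM it recognizes.
    public static Encoding DetectWithReader(string path, Encoding fallback)
    {
        // Third argument: detectEncodingFromByteOrderMarks.
        using (var reader = new StreamReader(path, fallback, true))
        {
            reader.Peek(); // force a read so that CurrentEncoding is updated
            return reader.CurrentEncoding;
        }
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            // File.WriteAllText writes the UTF-16BE BOM (FE FF) before the text.
            File.WriteAllText(path, "hello", Encoding.BigEndianUnicode);
            Console.WriteLine(DetectWithReader(path, Encoding.ASCII).WebName);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Querying CurrentEncoding before any read returns the encoding passed to the constructor, which may be one reason the property appears to report the wrong value.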
