How do I convert the encoding of a large file (>1 GB) to Windows-1252 without an out-of-memory exception?

Problem description

Consider:

public static void ConvertFileToUnicode1252(string filePath, Encoding srcEncoding)
{
    try
    {
        StreamReader fileStream = new StreamReader(filePath);
        Encoding targetEncoding = Encoding.GetEncoding(1252);

        string fileContent = fileStream.ReadToEnd();
        fileStream.Close();

        // Saving file as ANSI 1252
        Byte[] srcBytes = srcEncoding.GetBytes(fileContent);
        Byte[] ansiBytes = Encoding.Convert(srcEncoding, targetEncoding, srcBytes);
        string ansiContent = targetEncoding.GetString(ansiBytes);

        // Now writes contents to file again
        StreamWriter ansiWriter = new StreamWriter(filePath, false);
        ansiWriter.Write(ansiContent);
        ansiWriter.Close();
        //TODO -- log success  details
    }
    catch (Exception e)
    {
        throw e;
        // TODO -- log failure details
    }
}

The code above throws an out-of-memory exception for large files; it only works for small files.

Recommended answer

I think the most elegant solution is to keep using a StreamReader and a StreamWriter, but to read blocks of characters instead of reading everything at once or line by line. This doesn't arbitrarily assume that the file consists of lines of manageable length, and it doesn't break with multi-byte character encodings, because the StreamReader keeps its decoder state between reads and so handles characters whose bytes straddle a buffer boundary.

public static void ConvertFileEncoding(string srcFile, Encoding srcEncoding, string destFile, Encoding destEncoding)
{
    using (var reader = new StreamReader(srcFile, srcEncoding))
    using (var writer = new StreamWriter(destFile, false, destEncoding))
    {
        char[] buf = new char[4096];
        while (true)
        {
            int count = reader.Read(buf, 0, buf.Length);
            if (count == 0)
                break;

            writer.Write(buf, 0, count);
        }
    }
}
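
The question overwrites the original file, while the method above writes to a separate destination. A streaming conversion cannot read and write the same path at once, so one possible approach is to write to a temporary file and swap it in afterwards. The following is only a sketch under stated assumptions: `EncodingConverter`, `ConvertFileToWindows1252InPlace`, and the `.tmp` suffix are hypothetical names, not part of the original answer. It also assumes a runtime where `CodePagesEncodingProvider` is available (.NET Core / .NET 5+, where code page 1252 is not registered by default); on .NET Framework the registration line is not needed and can be removed.

```csharp
using System;
using System.IO;
using System.Text;

public static class EncodingConverter
{
    // Block-wise copy, as in the answer above: only one 4096-char buffer
    // is held in memory, regardless of the file size.
    public static void ConvertFileEncoding(string srcFile, Encoding srcEncoding,
                                           string destFile, Encoding destEncoding)
    {
        using (var reader = new StreamReader(srcFile, srcEncoding))
        using (var writer = new StreamWriter(destFile, false, destEncoding))
        {
            char[] buf = new char[4096];
            int count;
            while ((count = reader.Read(buf, 0, buf.Length)) > 0)
                writer.Write(buf, 0, count);
        }
    }

    // Hypothetical helper (an assumption, not from the original answer):
    // converts a file to Windows-1252 and replaces the original.
    public static void ConvertFileToWindows1252InPlace(string filePath, Encoding srcEncoding)
    {
        // Needed on .NET Core / .NET 5+ to make code page 1252 available.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding target = Encoding.GetEncoding(1252);

        // Assumption: this temporary path is free to use and on the same volume.
        string tempPath = filePath + ".tmp";
        ConvertFileEncoding(filePath, srcEncoding, tempPath, target);

        // Swap the converted file in only after the conversion has finished.
        File.Delete(filePath);
        File.Move(tempPath, filePath);
    }
}
```

A call might look like `EncodingConverter.ConvertFileToWindows1252InPlace(path, Encoding.UTF8);`, where `path` is whatever file needs converting. Note that characters with no Windows-1252 equivalent cannot survive the conversion, whichever way the file is read.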

(I wish StreamReader had a CopyTo method like Stream does; if it did, this would essentially be a one-liner!)
