How to write super-fast file-streaming code in C#?

Problem description

I have to split a huge file into many smaller files. Each of the destination files is defined by an offset and a length in bytes. I'm using the following code:

    private void copy(string srcFile, string dstFile, int offset, int length)
    {
        BinaryReader reader = new BinaryReader(File.OpenRead(srcFile));
        reader.BaseStream.Seek(offset, SeekOrigin.Begin);
        byte[] buffer = reader.ReadBytes(length);
    
        BinaryWriter writer = new BinaryWriter(File.OpenWrite(dstFile));
        writer.Write(buffer);
    }
    

Considering that I have to call this function about 100,000 times, it is remarkably slow.

1. Is there a way to connect the Writer directly to the Reader? (That is, without actually loading the contents into a buffer in memory.)

Solution

I don't believe there's anything within .NET to allow copying a section of a file without buffering it in memory. However, it strikes me that this is inefficient anyway, as it needs to open the input file and seek many times. If you're just splitting up the file, why not open the input file once, and then just write something like:

    public static void CopySection(Stream input, string targetFile, int length)
    {
        byte[] buffer = new byte[8192];
    
        using (Stream output = File.OpenWrite(targetFile))
        {
            int bytesRead = 1;
            // This will finish silently if we couldn't read "length" bytes.
            // An alternative would be to throw an exception
            while (length > 0 && bytesRead > 0)
            {
                bytesRead = input.Read(buffer, 0, Math.Min(length, buffer.Length));
                output.Write(buffer, 0, bytesRead);
                length -= bytesRead;
            }
        }
    }
    

This has a minor inefficiency in creating a buffer on each invocation - you might want to create the buffer once and pass that into the method as well:

    public static void CopySection(Stream input, string targetFile,
                                   int length, byte[] buffer)
    {
        using (Stream output = File.OpenWrite(targetFile))
        {
            int bytesRead = 1;
            // This will finish silently if we couldn't read "length" bytes.
            // An alternative would be to throw an exception
            while (length > 0 && bytesRead > 0)
            {
                bytesRead = input.Read(buffer, 0, Math.Min(length, buffer.Length));
                output.Write(buffer, 0, bytesRead);
                length -= bytesRead;
            }
        }
    }
    

Note that this also closes the output stream (due to the using statement), which your original code didn't.

The important point is that this will use the operating system's file buffering more efficiently, because you reuse the same input stream instead of reopening the file at the beginning and then seeking.
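
To make that concrete, here is a minimal sketch of what the calling side might look like. The Section type and the SplitFile name are illustrative assumptions rather than anything from the original question; the point is simply that the source file is opened once and a single buffer is reused across every call to CopySection:

    using System.Collections.Generic;
    using System.IO;

    // Hypothetical section descriptor: each destination file is defined
    // by a target path and a length in bytes. Offsets are implicit here,
    // because the sections are assumed to be contiguous.
    public sealed class Section
    {
        public string TargetFile;
        public int Length;
    }

    public static void SplitFile(string srcFile, IEnumerable<Section> sections)
    {
        byte[] buffer = new byte[8192];               // allocated once, reused throughout
        using (Stream input = File.OpenRead(srcFile)) // opened once, read sequentially
        {
            foreach (Section section in sections)
            {
                // No Seek needed: the stream position simply advances
                // past each section as it is copied.
                CopySection(input, section.TargetFile, section.Length, buffer);
            }
        }
    }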

I think it'll be significantly faster, but obviously you'll need to try it to see...

This assumes contiguous chunks, of course. If you need to skip bits of the file, you can do that from outside the method. Also, if you're writing very small files, you may want to optimise for that situation too - the easiest way to do that would probably be to introduce a BufferedStream wrapping the input stream.
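
As a rough sketch of that last suggestion (the 64 KB buffer size is an arbitrary illustrative choice, not a recommendation from the answer), the input stream could be wrapped before the copying loop starts:

    // Wrap the raw FileStream in a BufferedStream so that many small
    // reads are served from fewer, larger reads of the underlying file.
    using (Stream input = new BufferedStream(File.OpenRead(srcFile), 64 * 1024))
    {
        // ... call CopySection for each small target file, as before ...
    }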
