Reading a large binary file (> available RAM)


Question


I am using C# with .NET 4.0 on Windows 7, on a quad-core machine with 16 GB of RAM.

I have a very large binary file (say 24 GB). I want to break it up into chunks so that I can read it.

I am having problems. Can anyone help? Thanks!


private static int nChunks = 24;

// ... inside the calling method:
byte[] chunks;
for (int i = 0; i < nChunks; i++)
{
    chunks = null;
    chunks = chunker.getMyChunks(i, newFileName, nChunks);
    doStuff(chunks);
}







using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;

namespace myName
{
    public static class chunker
    {
        public static byte[] getMyChunks(int i, string filePath, int nChunks)
        {
            byte[] buffer;
            FileStream fileStream = new FileStream(filePath, FileMode.Open, FileAccess.Read);
            try
            {
                int length = (int)fileStream.Length;                // get total file length. This stays constant.

                long fLength = fileStream.Length / nChunks + 1;     // get the length of one of your chunks
                long readPositionFrom = fLength * i;
                long readPositionTo = fLength * (i+1);
                int lenToRead = Convert.ToInt32(readPositionTo - readPositionFrom);

                buffer = new byte[lenToRead];                               // Create buffer. This is what contains the data that is returned.
                int count;                                                  // Actual number of bytes that have been read
                int offset = Convert.ToInt32(readPositionFrom);             // Start reading from this position in fileStream.

                // read until Read method returns 0 (end of the stream has been reached)
                // http://msdn.microsoft.com/en-us/library/system.io.filestream.read.aspx
                while ((count = fileStream.Read(buffer, offset, lenToRead - offset)) > 0)
                {
                    // first arg: buffer. the byte array being read into.
                    // second arg: offset. start populating the array at some element other than the first.
                    // third arg: (lenToRead-offset). The number of elements to read into the array

                    offset += count;  // sum is a buffer offset for next reading
                }
            }
            finally
            {
                fileStream.Close();
            }
            return buffer;
        }
    }
}









Ah yes, good point.

The problem is that I don't know what the problem is (otherwise I would have fixed it).

To avoid further confusion: I am trying to read a 24 GB file by breaking it up.

I am able to read the first chunk (i.e. i = 0) and get an answer; however, as I don't know what the contents of the file are, I can't sanity-check it.

When the code goes into the second loop iteration (i = 1), it fails inside the try block.

It is clearly something to do with the way I am calling Read.

Any help gratefully received.

Thanks!

Recommended answer

At first sight: you calculate lenToRead = Convert.ToInt32(readPositionTo - readPositionFrom); but in the call to Read you also subtract offset (which starts at readPositionFrom) from lenToRead. So after the first iteration (i = 0), on the second chunk, you ask FileStream.Read to read 0 bytes. I don't know whether it allows that; my guess is not :) In any case, that's one problem.
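To make the arithmetic concrete, here is a small sketch that reproduces the question's calculation and returns the count its loop would pass to FileStream.Read on the first call for chunk i. The class and method names (ChunkMath, FirstReadCount) are made up for this illustration, and the file is assumed to be exactly 24 GiB (the question only says "say 24GB"). Everything is kept as long so that the sketch itself cannot overflow.

```csharp
using System;

public static class ChunkMath
{
    // Hypothetical helper: mirrors the question's arithmetic and returns
    // (lenToRead - offset), the count passed to the first Read call for chunk i.
    public static long FirstReadCount(long fileLength, int nChunks, int i)
    {
        long fLength = fileLength / nChunks + 1;             // chunk length, as in the question
        long readPositionFrom = fLength * i;                 // file position of the chunk start
        long readPositionTo = fLength * (i + 1);             // file position of the chunk end
        long lenToRead = readPositionTo - readPositionFrom;  // always equal to fLength
        long offset = readPositionFrom;                      // the question reuses this as the buffer offset
        return lenToRead - offset;
    }
}
```

Under that assumption, fLength is 1073741825, so the count is 1073741825 for i = 0, exactly 0 for i = 1, and negative for i = 2 and above. A negative count is rejected by FileStream.Read with an ArgumentOutOfRangeException. Separately, for i = 2 the value of readPositionFrom is 2147483650, which is larger than int.MaxValue, so Convert.ToInt32(readPositionFrom) in the original code would throw an OverflowException before Read is even called.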

If you continue to have a problem, feel free to comment on this solution with details, and I'll see if I have any ideas :)
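One possible way to restructure the reader, following the point above: the second argument of FileStream.Read is an index into the buffer, not a position in the file, so the file position can be set with Seek and the buffer offset can start at 0 for every chunk. This is a sketch, not the original poster's final code; the names ChunkReader and ReadChunk are invented here, and the chunk size formula is carried over from the question.

```csharp
using System;
using System.IO;

public static class ChunkReader
{
    // Hypothetical replacement for getMyChunks: returns the bytes of chunk i (0-based).
    public static byte[] ReadChunk(string filePath, int i, int nChunks)
    {
        using (FileStream fs = new FileStream(filePath, FileMode.Open, FileAccess.Read))
        {
            long chunkLength = fs.Length / nChunks + 1;   // same chunk size as the question
            long start = chunkLength * i;                 // file position, kept as long
            long remaining = fs.Length - start;           // bytes left from start to end of file
            if (remaining <= 0)
            {
                return new byte[0];                       // chunk lies entirely past the end
            }

            // The last chunk is usually shorter than chunkLength.
            int lenToRead = checked((int)Math.Min(chunkLength, remaining));
            byte[] buffer = new byte[lenToRead];

            fs.Seek(start, SeekOrigin.Begin);             // position within the file

            int offset = 0;                               // index into buffer, always starts at 0
            while (offset < lenToRead)
            {
                // Read may return fewer bytes than requested, so loop until the buffer is full.
                int count = fs.Read(buffer, offset, lenToRead - offset);
                if (count == 0)
                {
                    throw new EndOfStreamException("File ended before the chunk was complete.");
                }
                offset += count;
            }
            return buffer;
        }
    }
}
```

The calling loop from the question can stay as it is, with chunker.getMyChunks(i, newFileName, nChunks) replaced by ChunkReader.ReadChunk(newFileName, i, nChunks). Each chunk of a 24 GB file split 24 ways is roughly 1 GB, which still has to fit in a single byte array, so a larger nChunks may be more comfortable on a 16 GB machine.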

