Reading a large file (greater than 4 GB) in C using the read function causes problems


Problem description

I have to write C code that reads large files. The code is below:

int read_from_file_open(char *filename,long size)
{
    long read1=0;
    int result=1;
    int fd;
    int check=0;
    long *buffer=(long*) malloc(size * sizeof(int));
    fd = open(filename, O_RDONLY|O_LARGEFILE);
    if (fd == -1)
    {
       printf("\nFile Open Unsuccessful\n");
       exit (0);;
    }
    long chunk=0;
    lseek(fd,0,SEEK_SET);
    printf("\nCurrent Position%d\n",lseek(fd,size,SEEK_SET));
    while ( chunk < size )
    {
        printf ("the size of chunk read is  %d\n",chunk);
        if ( read(fd,buffer,1048576) == -1 )
        {
            result=0;
        }
        if (result == 0)
        {
            printf("\nRead Unsuccessful\n");
            close(fd);
            return(result);
        }

        chunk=chunk+1048576;
        lseek(fd,chunk,SEEK_SET);
        free(buffer);
    }

    printf("\nRead Successful\n");

    close(fd);
    return(result);
}

The issue I am facing is that as long as the size argument is less than 264000000 bytes, the read seems to work: I see the value of the chunk variable increasing with each cycle.

When I pass 264000000 bytes or more, the read fails, i.e. according to the check used, read returns -1.

Can anyone point out why this is happening? I am compiling with cc in normal mode, not using DD64.

Recommended answer

In the first place, why do you need lseek() in your loop? read() advances the file offset by the number of bytes read.
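To illustrate the point about the file offset, here is a minimal sketch. The helper name demo_sequential_reads and the temporary file are assumptions made for the example and are not part of the original code. It issues two read() calls with no lseek() between them and then asks the kernel where the offset is.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: write 8 bytes to a temporary file, rewind once,
 * read 3 bytes twice without any lseek() in between, and return the
 * offset the kernel reports afterwards (-1 on any error). */
off_t demo_sequential_reads(void)
{
    char scratch[3];
    off_t pos = (off_t)-1;
    FILE *tmp = tmpfile();              /* removed automatically on close */
    if (tmp == NULL)
        return pos;
    int fd = fileno(tmp);

    if (write(fd, "abcdefgh", 8) == 8 &&
        lseek(fd, 0, SEEK_SET) == 0 &&  /* rewind once, before reading */
        read(fd, scratch, sizeof scratch) == 3 &&
        read(fd, scratch, sizeof scratch) == 3)
    {
        /* lseek(fd, 0, SEEK_CUR) only queries the current offset */
        pos = lseek(fd, 0, SEEK_CUR);
    }
    fclose(tmp);
    return pos;
}
```

After the two reads the reported offset is 6, which is the sum of the bytes read, so repositioning by hand inside the loop is redundant.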

Now, to the main point: where long is 32 bits (as on a typical 32-bit build), long, and therefore chunk, has a maximum value of 2147483647. A larger value cannot be represented and will typically wrap around to a negative number.

You want to declare chunk as off_t (off_t chunk) and size as size_t. That is the main reason why lseek() fails.
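As a rough illustration of the range problem, the sketch below checks whether an offset fits in a signed integer of a given width and reports how wide off_t is in the current build. The helper names fits_in_signed and off_t_bits are made up for this example. The _FILE_OFFSET_BITS macro is a glibc feature-test macro and is assumed here; other platforms may already provide a 64-bit off_t without it.

```c
#define _FILE_OFFSET_BITS 64  /* assumption: glibc; requests a 64-bit off_t on 32-bit builds */
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical helper: returns 1 if offset is representable in a signed
 * two's-complement integer that is `bits` wide, 0 otherwise. */
int fits_in_signed(int64_t offset, int bits)
{
    assert(bits > 1 && bits <= 64);
    if (bits == 64)
        return 1;
    int64_t max = ((int64_t)1 << (bits - 1)) - 1;
    return offset >= -max - 1 && offset <= max;
}

/* Hypothetical helper: width of off_t in this build, in bits. */
int off_t_bits(void)
{
    return (int)(sizeof(off_t) * CHAR_BIT);
}
```

With these helpers, fits_in_signed(2147483648LL, 32) is false while the same offset fits in 64 bits, which is why a file offset beyond 2 GiB needs a 64-bit off_t rather than a 32-bit long.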

Then again, as other people have noticed, you do not want to free() your buffer inside the loop.

Note also that you overwrite the data you have already read, because every read() writes to the start of the buffer. Additionally, read() will not necessarily read as many bytes as you asked for, so it is better to advance chunk by the number of bytes actually read rather than by the number you requested.
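The advice about short reads can be sketched as a small helper. The name read_full is hypothetical and is not part of the original code, and retrying on EINTR is an extra assumption that the answer does not mention. It keeps calling read() until the requested count is reached, the file ends, or an error occurs, advancing by the bytes actually returned each time.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to `want` bytes from fd into buf.
 * Returns the number of bytes stored (less than `want` only at end of
 * file) or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t want)
{
    size_t done = 0;
    assert(buf != NULL);
    while (done < want) {
        /* Write after the bytes already stored, never over them */
        ssize_t n = read(fd, (char *)buf + done, want - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        if (n == 0)
            break;              /* end of file */
        done += (size_t)n;      /* advance by what was actually read */
    }
    return (ssize_t)done;
}
```

A caller would compare the return value with the requested count to tell a complete read from an early end of file.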

Taking all of this into account, the corrected code should probably look something like this:

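// Edited: note comments after the code
// Ask glibc for a 64-bit off_t on 32-bit builds; must come before any #include
#define _FILE_OFFSET_BITS 64
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#ifndef O_LARGEFILE
#define O_LARGEFILE 0
#endif

#define CHUNK_SIZE 1048576

int read_from_file_open(char *filename, size_t size)
{
    int fd;
    // size is a byte count, so allocate exactly that many bytes
    char *buffer = malloc(size);
    if (buffer == NULL)
    {
        printf("\nAllocation Unsuccessful\n");
        return 0;
    }
    fd = open(filename, O_RDONLY|O_LARGEFILE);
    if (fd == -1)
    {
        printf("\nFile Open Unsuccessful\n");
        free(buffer);
        exit(EXIT_FAILURE);
    }
    // A freshly opened file starts at offset 0, so no lseek() is needed
    off_t chunk = 0;
    while ((size_t)chunk < size)
    {
        printf("the size of chunk read is  %lld\n", (long long)chunk);
        // Never ask for more than the space left in the buffer
        size_t want = size - (size_t)chunk;
        if (want > CHUNK_SIZE)
            want = CHUNK_SIZE;
        // read() returns ssize_t: -1 on error, 0 at end of file
        ssize_t readnow = read(fd, buffer + chunk, want);
        if (readnow < 0)
        {
            printf("\nRead Unsuccessful\n");
            free(buffer);
            close(fd);
            return 0;
        }
        if (readnow == 0)
            break;  // file is shorter than size; stop instead of looping forever

        chunk = chunk + readnow;
    }

    printf("\nRead Successful\n");

    free(buffer);
    close(fd);
    return 1;
}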

I also took the liberty of removing the result variable and all related logic, since I believe it can be simplified.

Edit: I have noted that some systems (most notably, BSD) do not have O_LARGEFILE, since it is not needed there. So I have added an #ifndef guard at the beginning, which makes the code more portable.
