Big File reading error in C++


Problem description




    I need to read a file in C++ that has this specific format:

    10 5
    1 2 3 4 1 5 1 5 2 1

    All the values are separated by spaces. The first two numbers on the first line are the variables N and M respectively, and all N values from the second line need to go into an array called S of size N. The code I have written has no problem with small files like this one, but it does not work on the really big files, with millions of values, that I need it to handle. Here is the code:

    int N, M;
    FILE *read = fopen("file.in", "r");
    fscanf(read, "%d %d ", &N, &M);
    int S[N];
    for (int i = 0; i < N; i++) {
        fscanf(read, "%d ", &S[i]);
    }
    

    What should I change?

    Solution

    There are multiple potential issues when getting into the range of millions of integers:

    • int is most often 32 bits; a 32-bit signed integer has a range of -2^31 to 2^31 - 1, and thus a maximum of 2,147,483,647. A count of a few million still fits, but if N or the values themselves can exceed that limit, you should switch to a 64-bit integer type.

    • You are using int S[N], a Variable Length Array (VLA), which is not Standard C++ (it is Standard C99, but... there are discussions as to whether it was a good idea or not). The important detail, though, is that a VLA is stored on the stack: 1 million 32-bit ints is 4 MB, 2 million is 8 MB, etc. Check your default stack size; typical defaults are around 1 MB on Windows and 8 MB on Linux, so an array this large can overflow the stack (you're on the right site for help!).
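To put numbers on the two points above, here is a small sketch. The helper names `array_bytes` and `fits_on_stack` and the 8 MiB stack budget are assumptions made up for illustration; the real limit depends on the platform and on how the program is built and launched.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: bytes needed to store `count` 32-bit integers.
constexpr std::uint64_t array_bytes(std::uint64_t count) {
    return count * sizeof(std::int32_t);
}

// Assumed stack budget, for illustration only (8 MiB); check your platform.
constexpr std::uint64_t kAssumedStackBytes = 8ull * 1024 * 1024;

// Hypothetical helper: would the array alone fit inside that budget?
// (Real programs also need stack space for everything else.)
constexpr bool fits_on_stack(std::uint64_t count) {
    return array_bytes(count) < kAssumedStackBytes;
}

// Largest value a 32-bit signed integer can hold: 2^31 - 1.
constexpr std::int32_t kMaxInt32 = std::numeric_limits<std::int32_t>::max();

static_assert(kMaxInt32 == 2147483647, "2^31 - 1");
static_assert(array_bytes(1000000) == 4000000, "1 million ints is 4 MB");
static_assert(!fits_on_stack(3000000), "3 million ints exceed 8 MiB");
```

Under a 1 MiB budget, even 300,000 ints would already be too many, which is why moving the storage off the stack matters more than the exact threshold.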

    So, let's switch to C++ and do away with those issues:

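    #include <cstddef>  // for std::size_t
    #include <cstdint>  // for std::int64_t
    #include <fstream>
    #include <iostream>
    #include <vector>

    int main() {
        std::ifstream stream("data.txt");
        if (!stream) {
            std::cerr << "cannot open data.txt\n";
            return 1;
        }

        std::int64_t n = 0, m = 0;
        if (!(stream >> n >> m) || n < 0) {
            std::cerr << "bad header\n";
            return 1;
        }

        std::vector<int> data;  // elements live on the heap, not the stack
        data.reserve(static_cast<std::size_t>(n));
        for (std::int64_t c = 0; c != n; ++c) {
            int i = 0;
            if (!(stream >> i)) {
                break;  // file ended early or held a non-integer token
            }
            data.push_back(i);
        }

        // do your best :)
    }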
    

    First of all, we use int64_t from <cstdint> to do away with the integer overflow issue for N and M. Second, we use a stream (input file stream: ifstream) to avoid having to learn the format specifier associated with each and every integral type (it's a pain). Third, we use a vector to store the data we read; its elements are allocated on the heap, which does away with the stack overflow issue.
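The same approach can also be wrapped in a function that takes any input stream, which makes it easy to try on the sample input without a file. This is only a sketch: `read_values` and `demo` are hypothetical names that do not come from the answer, and returning false on malformed input is one possible design choice among several.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <vector>

// Hypothetical helper (not from the original answer): reads "N M" followed by
// N integers from any input stream. Returns false if the header or any of the
// N values is missing or malformed.
bool read_values(std::istream& in, std::int64_t& m, std::vector<int>& data) {
    std::int64_t n = 0;
    if (!(in >> n >> m) || n < 0) {
        return false;
    }
    data.clear();
    for (std::int64_t c = 0; c != n; ++c) {
        int value = 0;
        if (!(in >> value)) {
            return false;  // fewer than N readable integers
        }
        data.push_back(value);
    }
    return true;
}

// Usage sketch with the sample input from the question.
void demo() {
    std::istringstream sample("10 5\n1 2 3 4 1 5 1 5 2 1\n");
    std::int64_t m = 0;
    std::vector<int> data;
    bool ok = read_values(sample, m, data);
    assert(ok && m == 5 && data.size() == 10);
    (void)ok;  // avoids an unused-variable warning when NDEBUG disables assert
}
```

Because the function accepts a `std::istream&`, the same code works with a `std::ifstream` opened on the real file.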
