Big File reading error in C++
I need to read a file in C++ that has this specific format:
10 5
1 2 3 4 1 5 1 5 2 1
All the values are separated by spaces. The first two values on the first line are the variables N and M, respectively, and the N values on the second line need to go into an array called S of size N. The code I have written works for files like this one, but it fails on the really big files, with millions of values, that I need it to handle. Here is the code:
int N,M;
FILE *read = fopen("file.in", "r");
fscanf(read, "%d %d ", &N, &M);
int S[N];
for( i =0; i < N; i++){
fscanf(read, "%d ", &S[i]);
}
What should I change?
There are multiple potential issues when getting into the range of millions of integers.
First, int is most often 32 bits, and a 32-bit signed integer has a range of -2^31 to 2^31 - 1, so its maximum is 2,147,483,647. You should switch to a 64-bit integer type.
Second, you are using int S[N], a Variable Length Array (VLA), which is not Standard C++ (it is Standard C99, but there are discussions as to whether it was a good idea or not). The important detail, though, is that a VLA is stored on the stack: 1 million 32-bit int values take 4 MB, 2 million take 8 MB, and so on. Check your default stack size, but it is likely 8 MB or less, and thus you have a stack overflow (you're on the right site for help!).
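To make the arithmetic above concrete, here is a minimal sketch. The helper names bytes_for_int32 and fits_on_stack are made up for illustration, and the 1 MiB stack limit in the usage note is only an assumed figure; the real limit depends on your platform and settings.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: bytes needed to hold `count` 32-bit integers.
constexpr std::size_t bytes_for_int32(std::size_t count) {
    return count * sizeof(std::int32_t);
}

// Hypothetical helper: true if an array of `count` 32-bit integers fits in
// `stack_bytes`, ignoring everything else that already lives on the stack.
constexpr bool fits_on_stack(std::size_t count, std::size_t stack_bytes) {
    return bytes_for_int32(count) <= stack_bytes;
}
```

For example, fits_on_stack(2000000, 1024 * 1024) is false: two million values need about 8 MB, far more than an assumed 1 MiB stack.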
So, let's switch to C++ and do away with those issues:
#include <cstdint> // for int64_t
#include <fstream>
#include <vector>
int main(int argc, char* argv[]) {
    std::ifstream stream("data.txt");
    int64_t n = 0, m = 0;
    stream >> n >> m;
    std::vector<int> data;
    for (int64_t c = 0; c != n; ++c) {
        int i = 0;
        stream >> i;
        data.push_back(i);
    }
    // do your best :)
}
First of all, we use int64_t from <cstdint> to do away with the integer overflow issue. Second, we use a stream (an input file stream, ifstream) to avoid having to learn which format specifier goes with each integral type (it's a pain). Third, we use a vector to store the data we read; a vector keeps its elements on the heap, which does away with the stack overflow issue.