Safely reading large files

Problem description

How does C++ safely open and read very large files? For example, say I
have 1GB of physical memory and I open a 4GB file and attempt to read
it like so:

#include <iostream>
#include <fstream>
#include <string>
using namespace std;

int main () {
    string line;
    ifstream myfile ("example.txt", ios::binary);
    if (myfile.is_open())
    {
        while (! myfile.eof() )
        {
            getline (myfile,line);
            cout << line << endl;
        }
        myfile.close();
    }

    else cout << "Unable to open file";

    return 0;
}

In particular, what if a line in the file is larger than the amount of
available physical memory? What would happen? It seems getline() would
cause a crash. Is there a better way? Maybe check the amount of free
memory, then use 10% or so of that amount for the read. So if 1GB of
memory is free, take 100MB for file I/O. If only 10MB is free,
read just 1MB at a time. Repeat this step until the file has been
read completely. Is something built into standard C++ to handle this?
Or is there an accepted way to do this?

Thanks,

Brad

Recommended answer

** *****@gmail.com wrote:

How does C++ safely open and read very large files? For example, say I
have 1GB of physical memory and I open a 4GB file and attempt to read
it like so:

#include <iostream>
#include <fstream>
#include <string>
using namespace std;

int main () {
string line;
ifstream myfile ("example.txt", ios::binary);
if (myfile.is_open())
{
while (! myfile.eof() )
{
getline (myfile,line);
cout << line << endl;
}
myfile.close();
}

else cout << "Unable to open file";

return 0;
}

In particular, what if a line in the file is more than the amount of
available physical memory? What would happen? Seems getline() would
cause a crash. Is there a better way. Maybe... check amount of free
memory, then use 10% or so of that amount for the read. So if 1GB of
memory is free, then take 100MB for file IO. If only 10MB is free,
then just read 1MB at a time. Repeat this step until the file has been
read completely. Is something built into standard C++ to handle this?
Or is there a accepted way to do this?



Actually, performing operations that can lead to running out of memory
is not a simple thing at all. Yes, if you can estimate the amount of
memory you will need over what you right now want to allocate and you
know the size of available memory somehow, then you can allocate a chunk
and operate on that chunk until done and move over to the next chunk.
In the good ol' days that's how we solved large systems of linear
equations, one piece of the matrix at a time (or two if the algorithm
called for it).

Unfortunately there is no single straightforward solution. In most
cases you don't even know that you're going to run out of memory until
it's too late. You can write the program to handle those situations
using C++ exceptions. The pseudo-code might look like this:

std::size_t chunk_size = 1024*1024*1024;
MyAlgorithm algo;

do {
    try {
        algo.prepare_the_operation(chunk_size);
        // if I am here, the chunk_size is OK
        algo.perform_the_operation();
        algo.wrap_it_up();
        break; // finished, no need to retry
    }
    catch (std::bad_alloc & e) {
        chunk_size /= 2; // or any other adjustment
    }
}
while (chunk_size > 1024*1024); // or some other threshold

That way if your preparation fails, you just restart it using a smaller
chunk, until you either complete the operation or your chunk is too
small and you can't really do anything...

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
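
The retry idea in the pseudo-code above can be written as a small compilable sketch. The helper name run_with_shrinking_chunk and its parameters are assumptions made for illustration, not something from the original post. Note also that on systems that overcommit memory, an allocation may appear to succeed and std::bad_alloc may never be thrown, so this is one possible pattern rather than a guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <new>

// Hypothetical helper: calls attempt(chunk_size) and halves the chunk size
// every time the attempt throws std::bad_alloc.
// Returns the chunk size that worked, or 0 if nothing >= min_chunk worked.
std::size_t run_with_shrinking_chunk(
        std::size_t chunk_size, std::size_t min_chunk,
        const std::function<void(std::size_t)>& attempt) {
    assert(min_chunk > 0);
    while (chunk_size >= min_chunk) {
        try {
            attempt(chunk_size);   // may throw std::bad_alloc
            return chunk_size;     // success: stop retrying
        } catch (const std::bad_alloc&) {
            chunk_size /= 2;       // or any other adjustment
        }
    }
    return 0;                      // chunk became too small to be useful
}
```

The attempt callback would typically allocate a buffer of the given size and do the whole job with it, so that a failed allocation leaves nothing half-done.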


by*******@gmail.com writes:

while (! myfile.eof() )
{
getline (myfile,line);
cout << line << endl;
}
myfile.close();
}

else cout << "Unable to open file";

return 0;
}

In particular, what if a line in the file is more than the amount of
available physical memory?



The C++ library will fail to allocate sufficient memory, throw an exception,
and terminate the process.


What would happen? Seems getline() would
cause a crash. Is there a better way. Maybe... check amount of free
memory, then use 10% or so of that amount for the read.



And where exactly would you propose to store the remaining 90% of
std::string?


read completely. Is something built into standard C++ to handle this?



No. std::getline() reads as much as necessary, until the end-of-line
character, and the resulting std::string has to be big enough to store the
entire line.
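
Since std::getline() grows the string until it finds the end-of-line character, one way to keep memory bounded is to cap how much of each line is stored. The following is a minimal sketch under that assumption; bounded_getline and count_truncated are hypothetical names, not standard library functions, and discarding the tail of an over-long line is only one possible policy.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not part of the standard library).
// Reads one line from `in` but stores at most `max_len` characters in `out`;
// the rest of an over-long line is read and discarded.
// Returns false only when no more input is available.
bool bounded_getline(std::istream& in, std::string& out,
                     std::size_t max_len, bool& truncated) {
    out.clear();
    truncated = false;
    bool got_any = false;
    char c;
    while (in.get(c)) {                 // buffered by the stream, one char at a time
        got_any = true;
        if (c == '\n') return true;     // end of line: newline is not stored
        if (out.size() < max_len) out.push_back(c);
        else truncated = true;          // keep reading, but store nothing more
    }
    return got_any;  // a last line without a trailing newline still counts
}

// Usage sketch on in-memory text: counts lines longer than `max_len`.
std::size_t count_truncated(const std::string& text, std::size_t max_len) {
    std::istringstream in(text);
    std::string line;
    bool truncated = false;
    std::size_t n = 0;
    while (bounded_getline(in, line, max_len, truncated)) {
        if (truncated) ++n;
    }
    return n;
}
```

With this approach the string never holds more than max_len characters, whatever the length of the line in the file.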


Or is there a accepted way to do this?



If you need a way to handle this situation, you would not use std::getline(),
but some different approach, such as std::istream::get or std::istream::read.
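
A minimal sketch of the std::istream::read approach mentioned above: the file is consumed in fixed-size pieces, so memory use stays at one buffer regardless of file size or line length. The names for_each_chunk and chunk_sizes are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads `in` in pieces of at most `chunk_size` bytes and
// calls handle(data, size) for each piece. Returns the total bytes read.
template <typename Handler>
std::uintmax_t for_each_chunk(std::istream& in, std::size_t chunk_size,
                              Handler handle) {
    assert(chunk_size > 0);
    std::vector<char> buffer(chunk_size);   // the only large allocation
    std::uintmax_t total = 0;
    while (in) {
        in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        const std::streamsize got = in.gcount();  // short on the final read
        if (got <= 0) break;
        handle(buffer.data(), static_cast<std::size_t>(got));
        total += static_cast<std::uintmax_t>(got);
    }
    return total;
}

// Usage sketch on in-memory data: returns the size of each chunk delivered.
std::vector<std::size_t> chunk_sizes(const std::string& data,
                                     std::size_t chunk_size) {
    std::istringstream in(data);
    std::vector<std::size_t> sizes;
    for_each_chunk(in, chunk_size,
                   [&sizes](const char*, std::size_t n) { sizes.push_back(n); });
    return sizes;
}
```

For a real file, the same function can be passed a std::ifstream opened with ios::binary. A line may span two chunks, so code that cares about line boundaries has to carry the unfinished tail over to the next chunk.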



by*******@gmail.com wrote:

How does C++ safely open and read very large files? For example, say I
have 1GB of physical memory and I open a 4GB file and attempt to read
it like so:



Others have already answered your question, so I'm going to address
something else.


#include <iostream>
#include <fstream>
#include <string>
using namespace std;

int main () {
string line;
ifstream myfile ("example.txt", ios::binary);
if (myfile.is_open())
{



This while loop doesn't do what you think it does. See FAQ 15.5:

http://parashift.com/c++-faq-lite/in....html#faq-15.5


while (! myfile.eof() )
{
getline (myfile,line);
cout << line << endl;
}
myfile.close();
}

else cout << "Unable to open file";

return 0;
}
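
The loop quoted above tests eof() before reading, so the body runs one more time after the final getline() has already failed. A common fix is to use the read itself as the loop condition. The sketch below shows that form; the function names copy_lines and echo_lines are assumptions made for illustration.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Copies `in` to `out` line by line and returns the number of lines read.
std::size_t copy_lines(std::istream& in, std::ostream& out) {
    std::string line;
    std::size_t count = 0;
    while (std::getline(in, line)) {   // loop body runs only after a successful read
        out << line << '\n';
        ++count;
    }
    return count;
}

// Usage sketch on in-memory text.
std::string echo_lines(const std::string& text) {
    std::istringstream in(text);
    std::ostringstream out;
    copy_lines(in, out);
    return out.str();
}
```

This fixes the loop logic only. Each line is still read whole into a std::string, so the memory concern about very long lines remains.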

