C/C++ System portable way to change maximum number of open files
Question
I have a C++ program that transposes a very large matrix. The matrix is too large to hold in memory, so I was writing each column to a separate temporary file, and then concatenating the temporary files once the whole matrix has been processed. However, I am now finding that I am running up against the problem of having too many open temporary files (i.e. the OS doesn't allow me to open enough temporary files). Is there a system portable method for checking (and hopefully changing) the maximum number of allowed open files?
I realise I could close each temp file and reopen only when needed, but am worried about the performance impact of doing this.
My code works as follows (pseudocode - not guaranteed to work):
const unsigned int Ncol = 5000;  // For example - could be much bigger.
const unsigned int Nrow = 50000; // For example - in reality much bigger.

// Stage 1 - create temp files.
vector<ofstream *> tmp_files(Ncol);  // Vector of temp file pointers.
vector<string> tmp_filenames(Ncol);  // Vector of temp file names.
for (unsigned int ui = 0; ui < Ncol; ui++)
{
    string filename(tmpnam(NULL));   // Get a temp filename.
    ofstream *tmp_file = new ofstream(filename.c_str());
    if (!tmp_file->good())
        error("Could not open temp file.\n");  // Call error function.
    (*tmp_file) << "Column" << ui;
    tmp_files[ui] = tmp_file;
    tmp_filenames[ui] = filename;
}

// Stage 2 - read the input file and write each column to its temp file.
ifstream input_file(input_filename.c_str());
for (unsigned int s = 0; s < Nrow; s++)
{
    int input_num;
    for (unsigned int ui = 0; ui < Ncol; ui++)
    {
        input_file >> input_num;
        ofstream *tmp_file = tmp_files[ui];  // Get temp file pointer.
        (*tmp_file) << "\t" << input_num;    // Write entry to temp file.
    }
}
input_file.close();

// Stage 3 - concatenate temp files into the output file and clean up.
ofstream output_file("out.txt");
for (unsigned int ui = 0; ui < Ncol; ui++)
{
    // Finish and close the temp file.
    ofstream *tmp_file = tmp_files[ui];
    (*tmp_file) << endl;
    tmp_file->close();
    delete tmp_file;  // Avoid leaking the stream object.

    // Read the column back and write it to the output file.
    ifstream read_file(tmp_filenames[ui].c_str());
    if (!read_file.good())
        error("Could not open temp file for reading.");  // Call error function.
    string tmp_line;
    getline(read_file, tmp_line);
    output_file << tmp_line << endl;
    read_file.close();

    // Delete the temp file from disk.
    remove(tmp_filenames[ui].c_str());
}
output_file.close();
Many thanks!
Adam
Answer
There are at least two limits:

- the operating system may impose a limit; in Unix (sh, bash, and similar shells), use ulimit -n to change the limit, within the bounds allowed by the sysadmin
- the C library implementation may have a limit as well; you'll probably need to recompile the library to change that
A better solution is to avoid having so many open files. In one of my own programs, I wrote a wrapper around the file abstraction (this was in Python, but the principle is the same in C), which keeps track of the current file position in each file, and opens/closes files as needed, keeping a pool of currently-open files.
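The wrapper idea above might be sketched in C++ like this (a minimal illustration, not the answerer's actual Python code; the class name FilePool and the LRU policy are my own choices, error handling is omitted, and C++11 is assumed). Reopening in append mode lets the filesystem remember the write position, so no offsets need to be tracked by hand:

```cpp
#include <fstream>
#include <list>
#include <string>
#include <unordered_map>

// Keeps at most max_open streams open at once; the least recently used
// file is closed when a new one is needed, and closed files are later
// reopened in append mode so earlier writes are preserved.
class FilePool {
public:
    explicit FilePool(std::size_t max_open) : max_open_(max_open) {}

    std::ofstream &get(const std::string &name) {
        auto it = open_.find(name);
        if (it != open_.end()) {
            // Already open: move to the front of the LRU list.
            lru_.splice(lru_.begin(), lru_, it->second.lru_pos);
            return it->second.stream;
        }
        if (open_.size() >= max_open_)
            evict();
        lru_.push_front(name);
        Entry e;
        e.stream.open(name, std::ios::app);  // Append: keep earlier contents.
        e.lru_pos = lru_.begin();
        return open_.emplace(name, std::move(e)).first->second.stream;
    }

private:
    struct Entry {
        std::ofstream stream;
        std::list<std::string>::iterator lru_pos;
    };

    void evict() {
        // Erasing the map entry destroys the ofstream, which closes
        // (and flushes) the file.
        open_.erase(lru_.back());
        lru_.pop_back();
    }

    std::size_t max_open_;
    std::unordered_map<std::string, Entry> open_;
    std::list<std::string> lru_;
};
```

With such a pool, Stage 2 of the question's code would call pool.get(tmp_filenames[ui]) instead of holding thousands of streams open at once, trading some reopen overhead for staying under the OS limit.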