std::ifstream buffer caching


Problem description

In my application I'm trying to merge sorted files (keeping them sorted, of course), so I have to iterate through each element in both files and write the smaller one to a third file. This is quite slow on big files. Since I don't see any other choice (the iteration has to be done), I'm trying to optimize file loading. I have some RAM that I can use for buffering: instead of reading 4 bytes from both files every time, I could read something like 100 MB at once and work with that buffer until it has no elements left, then refill it. But I suspect ifstream already does this internally. Would doing it myself give me better performance, and is there any reason to? If fstream already buffers, can I change the size of that buffer?

Added

My current code looks like this (pseudocode):

// this is done in loop
int i1 = input1.read_integer();
int i2 = input2.read_integer();
if (!input1.eof() && !input2.eof())
{
   if (i1 < i2)
   {
      output.write(i1);
      input2.seek_back(sizeof(int));
   } else {
      input1.seek_back(sizeof(int));
      output.write(i2);
   }
} else {
   if (input1.eof())
      output.write(i2);
   else if (input2.eof())
      output.write(i1);
}

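What I don't like here:

  • seek_back - I have to seek back to the previous position, because there is no way to peek 4 bytes

  • if one of the streams is at EOF, the loop still keeps checking that stream instead of copying the rest of the other stream directly to the output; this is not a big issue, though, because the chunk sizes are almost always equal.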

Can you suggest improvements?

Thanks.

Recommended answer

Without getting into the discussion on stream buffers, you can get rid of seek_back and make the code much simpler by using std::merge with stream iterators:

using namespace std;
merge(istream_iterator<int>(file1), istream_iterator<int>(),
           istream_iterator<int>(file2), istream_iterator<int>(),
           ostream_iterator<int>(cout));
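To try the snippet above without any files, here is a minimal self-contained sketch that runs the same std::merge call on in-memory streams. The helper names merge_sorted_text and merge_sorted_strings are hypothetical (they are not part of the original answer), and the input is assumed to be whitespace-separated integers in ascending order.

```cpp
#include <algorithm>
#include <istream>
#include <iterator>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: merges two streams of whitespace-separated,
// ascending integers into `out`, writing one value per line.
void merge_sorted_text(std::istream& in1, std::istream& in2, std::ostream& out)
{
    std::merge(std::istream_iterator<int>(in1), std::istream_iterator<int>(),
               std::istream_iterator<int>(in2), std::istream_iterator<int>(),
               std::ostream_iterator<int>(out, "\n"));
}

// Hypothetical convenience wrapper for experimenting with string input.
std::string merge_sorted_strings(const std::string& a, const std::string& b)
{
    std::istringstream in1(a);
    std::istringstream in2(b);
    std::ostringstream out;
    merge_sorted_text(in1, in2, out);
    return out.str();
}
```

For example, merge_sorted_strings("1 3 5", "2 4 6") yields the six values in ascending order, one per line. Because istream_iterator keeps the most recently read value, no seek-back is needed, and std::merge copies the remainder of one input once the other is exhausted.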



Edit: added binary capability

#include <algorithm>
#include <fstream>
#include <iostream>
#include <iterator>

// Wraps an int so that operator>> reads it as raw bytes instead of text.
struct BinInt
{
    int value;
    operator int() const { return value; }
    friend std::istream& operator>>(std::istream& stream, BinInt& data)
    {
        return stream.read(reinterpret_cast<char*>(&data.value), sizeof(int));
    }
};

int main()
{
    // Open in binary mode so no newline translation is applied to the raw bytes.
    std::ifstream   file1("f1.txt", std::ios::binary);
    std::ifstream   file2("f2.txt", std::ios::binary);

    // The merged values are written to std::cout as text, one per line.
    std::merge(std::istream_iterator<BinInt>(file1), std::istream_iterator<BinInt>(),
               std::istream_iterator<BinInt>(file2), std::istream_iterator<BinInt>(),
               std::ostream_iterator<int>(std::cout, "\n"));
}
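On the buffer-size part of the question: a file stream already buffers its reads, and a larger buffer can be requested through rdbuf()->pubsetbuf(). The sketch below is an assumption-laden illustration, not the answer's method. It uses an explicit loop that keeps the current value of each input in a variable (so no seek-back is needed) instead of the iterator approach above. The function name merge_binary_files and the default buffer size are hypothetical. The effect of pubsetbuf on a file stream is implementation-defined, and libraries differ on whether the call must come before open(), so any gain should be measured.

```cpp
#include <cstddef>
#include <fstream>
#include <ios>
#include <string>
#include <vector>

// Reads one native-endian int; returns false at end of file or on error.
static bool read_int(std::istream& in, int& value)
{
    return static_cast<bool>(in.read(reinterpret_cast<char*>(&value), sizeof value));
}

static void write_int(std::ostream& out, int value)
{
    out.write(reinterpret_cast<const char*>(&value), sizeof value);
}

// Hypothetical helper: merges two binary files of ascending ints into out_path.
// buf_bytes is the buffer size requested for each stream (an assumption to tune).
bool merge_binary_files(const std::string& path1, const std::string& path2,
                        const std::string& out_path, std::size_t buf_bytes = 1 << 20)
{
    // The buffers are declared before the streams so that they outlive them.
    std::vector<char> buf1(buf_bytes), buf2(buf_bytes), buf3(buf_bytes);
    std::ifstream in1, in2;
    std::ofstream out;

    // Request the buffers before opening; the effect is implementation-defined.
    in1.rdbuf()->pubsetbuf(buf1.data(), static_cast<std::streamsize>(buf1.size()));
    in2.rdbuf()->pubsetbuf(buf2.data(), static_cast<std::streamsize>(buf2.size()));
    out.rdbuf()->pubsetbuf(buf3.data(), static_cast<std::streamsize>(buf3.size()));

    in1.open(path1, std::ios::binary);
    in2.open(path2, std::ios::binary);
    out.open(out_path, std::ios::binary);
    if (!in1 || !in2 || !out)
        return false;

    // Hold the current element of each input instead of seeking back.
    int a = 0, b = 0;
    bool has_a = read_int(in1, a);
    bool has_b = read_int(in2, b);
    while (has_a && has_b) {
        if (b < a) { write_int(out, b); has_b = read_int(in2, b); }
        else       { write_int(out, a); has_a = read_int(in1, a); }
    }
    // One input is exhausted: copy the rest of the other without comparisons.
    while (has_a) { write_int(out, a); has_a = read_int(in1, a); }
    while (has_b) { write_int(out, b); has_b = read_int(in2, b); }

    return static_cast<bool>(out.flush());
}
```

A call such as merge_binary_files("f1.txt", "f2.txt", "merged.bin") would merge the two inputs into a third file; the output name here is only a placeholder.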
