Equivalent of a Python generator in C++ for buffered reads


Question

Guido van Rossum demonstrates the simplicity of Python in this article and uses the following function for buffered reads of a file of unknown length:

import array

def intsfromfile(f):
    while True:
        a = array.array('i')
        # frombytes replaces the article's fromstring, which was removed in Python 3.9
        a.frombytes(f.read(4000))
        if not a:
            break
        for x in a:
            yield x

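I need to do the same thing in C++ for speed reasons! I have a number of files containing sorted lists of unsigned 64-bit integers that I need to merge. I found this nice piece of code for merging vectors.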

I am stuck on how to make an ifstream for a file of unknown length present itself as a vector, so that it can be happily iterated over until the end of the file is reached. Any suggestions? Am I barking up the correct tree with an istreambuf_iterator?

Recommended answer

To disguise an ifstream (or really, any input stream) as something that acts like an iterator, use the istream_iterator or the istreambuf_iterator template class. The former is useful when the formatting of the file matters. For example, a file full of whitespace-delimited integers can be read into a vector through its iterator-range constructor as follows:

#include <fstream>
#include <vector>
#include <iterator> // needed for istream_iterator

using namespace std;

int main(int argc, char** argv)
{
    ifstream infile("my-file.txt");

    // These iterators are usually passed as temporaries rather than
    // declared as standalone variables, but see below for why named
    // variables are necessary when initializing a container.
    istream_iterator<int> infile_begin(infile);
    istream_iterator<int> infile_end;

    vector<int> my_ints(infile_begin, infile_end);

    // You can also do stuff with the istream_iterator objects directly:
    // Careful! If you run this program as is, this won't work because we
    // used up the input stream already with the vector.

    int total = 0;
    while (infile_begin != infile_end) {
        total += *infile_begin;
        ++infile_begin;
    }

    return 0;
}

istreambuf_iterator reads through a file one character at a time, disregarding the formatting of the input. That is, it returns every character, including spaces, newline characters, and so on. Depending on your application, that may be more appropriate.
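To make the difference concrete, here is a minimal sketch that counts the characters each iterator type delivers from the same text. The helper names count_formatted and count_raw are made up for this illustration, and an istringstream stands in for the file.

```cpp
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>

// Counts characters seen through istream_iterator<char>, which uses
// formatted extraction (operator>>) and therefore skips whitespace.
// (Hypothetical helper, for illustration only.)
std::size_t count_formatted(const std::string& text)
{
    std::istringstream in(text);
    return static_cast<std::size_t>(
        std::distance(std::istream_iterator<char>(in),
                      std::istream_iterator<char>()));
}

// Counts characters seen through istreambuf_iterator<char>, which reads
// the stream buffer directly and returns every character unchanged.
// (Hypothetical helper, for illustration only.)
std::size_t count_raw(const std::string& text)
{
    std::istringstream in(text);
    return static_cast<std::size_t>(
        std::distance(std::istreambuf_iterator<char>(in),
                      std::istreambuf_iterator<char>()));
}
```

For the five-character text "a b\nc", count_formatted sees only the three letters, while count_raw sees all five characters including the space and the newline.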

Note: Scott Meyers explains in Effective STL why the separate variable declarations for istream_iterator are needed above. Normally, you would write something like this:

ifstream infile("my-file.txt");
vector<int> my_ints(istream_iterator<int>(infile), istream_iterator<int>());

However, C++ parses the second line in an incredibly bizarre way (this is known as the "most vexing parse"). It treats the line as the declaration of a function named my_ints that takes two parameters and returns a vector<int>. The first parameter is of type istream_iterator<int> and is named infile (the parentheses around the name are ignored). The second parameter is an unnamed function pointer that takes zero arguments (because of the parentheses) and returns an object of type istream_iterator<int>.

Pretty cool, but also pretty aggravating if you're not watching out for it.

Edit

Here's an example that uses istreambuf_iterator to read in a file of 64-bit numbers laid out end-to-end:

#include <cstdint>  // needed for uint64_t
#include <fstream>
#include <vector>
#include <iterator>

using namespace std;

int main(int argc, char** argv)
{
    // Open in binary mode so no newline translation touches the raw bytes.
    ifstream input("my-file.txt", ios::binary);
    istreambuf_iterator<char> input_begin(input);
    istreambuf_iterator<char> input_end;

    // Fill a char vector with input file's contents:
    vector<char> char_input(input_begin, input_end);
    input.close();

    // Convert it to an array of 64-bit unsigned integers with a cast.
    // uint64_t is used because unsigned long is not 64 bits on every platform.
    uint64_t* converted = reinterpret_cast<uint64_t*>(char_input.data());
    size_t num_long_elements = char_input.size() / sizeof(uint64_t);

    // Put that information into a vector:
    vector<uint64_t> long_input(converted, converted + num_long_elements);

    return 0;
}

Now, I personally rather dislike this solution (using reinterpret_cast, exposing char_input's array), but I'm not familiar enough with istreambuf_iterator to comfortably use one templatized over 64-bit characters, which would make this much easier.
