Equivalent of a python generator in C++ for buffered reads
Question
Guido Van Rossum demonstrates the simplicity of Python in this article and makes use of this function for buffered reads of a file of unknown length:
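import array

def intsfromfile(f):
    while True:
        a = array.array('i')
        # frombytes() replaces the fromstring() call used in the article;
        # fromstring() was removed in Python 3.9.
        a.frombytes(f.read(4000))
        if not a:
            break
        for x in a:
            yield x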
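I need to do the same thing in C++ for speed reasons! I have a number of files containing sorted lists of unsigned 64-bit integers that I need to merge. I found this nice piece of code for merging vectors.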
I am stuck on how to make an ifstream for a file of unknown length present itself as a vector which can be happily iterated over until the end of the file is reached. Any suggestions? Am I barking up the correct tree with an istreambuf_iterator?
Recommended answer
In order to disguise an ifstream (or really, any input stream) in a form that acts like an iterator, you want to use the istream_iterator or the istreambuf_iterator template class. The former is useful for files where the formatting is of concern. For example, a file full of whitespace-delimited integers can be read into a vector through its iterator-range constructor as follows:
#include <fstream>
#include <vector>
#include <iterator> // needed for istream_iterator
using namespace std;
int main(int argc, char** argv)
{
    ifstream infile("my-file.txt");
    // It isn't customary to declare these as standalone variables,
    // but see below for why it's necessary when working with
    // initializing containers.
    istream_iterator<int> infile_begin(infile);
    istream_iterator<int> infile_end;
    vector<int> my_ints(infile_begin, infile_end);
    // You can also do stuff with the istream_iterator objects directly:
    // Careful! If you run this program as is, this won't work because we
    // used up the input stream already with the vector.
    int total = 0;
    while (infile_begin != infile_end) {
        total += *infile_begin;
        ++infile_begin;
    }
    return 0;
}
istreambuf_iterator is used to read through files a single character at a time, disregarding the formatting of the input. That is, it will return you all characters, including spaces, newline characters, and so on. Depending on your application, that may be more appropriate.
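To make the difference between the two iterator types concrete, here is a minimal sketch that counts the characters each one sees in the same text. The helper names count_formatted and count_raw are made up for this illustration, and a std::istringstream stands in for the file.

```cpp
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>

// Counts characters as seen by istream_iterator<char>, which reads with
// operator>> and therefore skips whitespace by default.
std::size_t count_formatted(const std::string& text)
{
    std::istringstream in(text);
    std::istream_iterator<char> first(in), last;
    return static_cast<std::size_t>(std::distance(first, last));
}

// Counts characters as seen by istreambuf_iterator<char>, which reads the
// stream buffer directly and returns every character, whitespace included.
std::size_t count_raw(const std::string& text)
{
    std::istringstream in(text);
    std::istreambuf_iterator<char> first(in), last;
    return static_cast<std::size_t>(std::distance(first, last));
}
```

For the text "a b\nc", count_formatted sees only the three letters, while count_raw also sees the space and the newline.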
Note: Scott Meyers explains in Effective STL why the separate variable declarations for istream_iterator are needed above. Normally, you would do something like this:
ifstream infile("my-file.txt");
vector<int> my_ints(istream_iterator<int>(infile), istream_iterator<int>());
However, C++ actually parses the second line in an incredibly bizarre way. It sees it as the declaration of a function named my_ints that takes in two parameters and returns a vector<int>. The first parameter is of type istream_iterator<int> and is named infile (the parentheses around the name are ignored). The second parameter is a function pointer with no name that takes zero arguments (because of the parentheses) and returns an object of type istream_iterator<int>.
Pretty cool, but also pretty aggravating if you're not watching out for it.
Edit
Here's an example using the istreambuf_iterator to read in a file of 64-bit numbers laid out end-to-end:
#include <cstdint>  // uint64_t
#include <fstream>
#include <vector>
#include <iterator>
using namespace std;
int main(int argc, char** argv)
{
    // Binary mode keeps the raw bytes intact on every platform.
    ifstream input("my-file.txt", ios::binary);
    istreambuf_iterator<char> input_begin(input);
    istreambuf_iterator<char> input_end;
    // Fill a char vector with input file's contents:
    vector<char> char_input(input_begin, input_end);
    input.close();
    // Reinterpret the bytes as 64-bit integers with a cast. uint64_t is used
    // because unsigned long is only 32 bits wide on some platforms. This
    // assumes the file was written in this machine's byte order.
    uint64_t* converted = reinterpret_cast<uint64_t*>(char_input.data());
    size_t num_long_elements = char_input.size() / sizeof(uint64_t);
    // Put that information into a vector:
    vector<uint64_t> long_input(converted, converted + num_long_elements);
    return 0;
}
Now, I personally rather dislike this solution (using reinterpret_cast, exposing char_input's array), but I'm not familiar enough with istreambuf_iterator to comfortably use one templatized over 64-bit characters, which would make this much easier.