Speed up integer reading from file in C++
Question
I'm reading a file, line by line, and extracting integers from it. Some noteworthy points:
- the input file is not in binary;
- I cannot load up the whole file in memory;
- file format (only integers, separated by some delimiter):
x1 x2 x3 x4 ...
y1 y2 y3 ...
z1 z2 z3 z4 z5 ...
...
Just to add context, I'm reading the integers and counting them, using an std::unordered_map<unsigned int, unsigned int>.
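A minimal sketch of what that counting step might look like, assuming whitespace-separated values. The helper name `count_items` is made up for illustration and is not taken from the actual program:

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <unordered_map>

// Hypothetical helper: counts how often each integer occurs in `in`,
// reading line by line and parsing each line with a stringstream.
std::unordered_map<unsigned int, unsigned int> count_items(std::istream& in) {
    std::unordered_map<unsigned int, unsigned int> counts;
    std::string line;
    while (std::getline(in, line)) {
        std::stringstream ss(line);
        unsigned int item;
        while (ss >> item) {
            ++counts[item];  // operator[] inserts a zero count on first sight
        }
    }
    return counts;
}
```

It can be called with any input stream, for example a `std::ifstream` opened on the input file.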
Simply looping through lines, and allocating useless stringstreams, like this:
std::string line;
std::fstream infile(<inpath>, std::ios::in);
while (std::getline(infile, line)) {
    std::stringstream ss(line);  // constructed but never read from
}
gives me ~2.7s for a 700MB file.
Parsing each line:
unsigned int item;
std::string line;
std::fstream infile(<inpath>, std::ios::in);
while (std::getline(infile, line)) {
    std::stringstream ss(line);
    while (ss >> item);  // extract every integer on the line
}
Gives me ~17.8s for the same file.
If I change the operator to std::getline + atoi:
unsigned int item;
std::string line, token;
std::fstream infile(<inpath>, std::ios::in);
while (std::getline(infile, line)) {
    std::stringstream ss(line);
    while (std::getline(ss, token, ' ')) item = atoi(token.c_str());
}
That takes ~14.6s.
Is there anything faster than these approaches? I don't think it's necessary to speed up the file reading, just the parsing itself, though speeding up both wouldn't do any harm (:
Recommended answer
This program
#include <iostream>
int main()
{
    int num;
    while (std::cin >> num) ;
}
needs about 17 seconds to read a file. This code
#include <iostream>
int main()
{
    int lc = 0;    // line count
    int item = 0;  // integer currently being parsed
    char buf[2048];
    do
    {
        std::cin.read(buf, sizeof(buf));
        int k = std::cin.gcount();  // bytes actually read into buf
        for (int i = 0; i < k; ++i)
        {
            switch (buf[i])
            {
                case '\r':
                    break;
                case '\n':
                    item = 0; lc++;
                    break;
                case ' ':
                    item = 0;
                    break;
                case '0': case '1': case '2': case '3':
                case '4': case '5': case '6': case '7':
                case '8': case '9':
                    item = 10*item + buf[i] - '0';
                    break;
                default:
                    std::cerr << "Bad format\n";
            }
        }
    } while (std::cin);
}
needs 1.25 seconds for the same file. Make of it what you will...