为什么std :: fstreams这么慢? [英] Why are std::fstreams so slow?
问题描述
我正在研究一个简单的解析器,当我观察到瓶颈是在...文件读取!我提取了非常简单的测试来比较读取大块数据时 fstreams
和 FILE *
的性能: p>
#include< stdio.h>
#include< chrono>
#include< fstream>
#include< iostream>
#include< functional>
void measure(const std :: string& test,std :: function< void()> function)
{
auto start_time = std :: chrono :: high_resolution_clock ::现在();
function();
auto duration = std :: chrono :: duration_cast< std :: chrono :: nanoseconds>(std :: chrono :: high_resolution_clock :: now() - start_time);
std :: cout<<< test<<<< static_cast< double>(duration.count())* 0.000001<ms< std :: endl;
}
#define BUFFER_SIZE(1024 * 1024 * 1024)
int main(int argc,const char * argv [])
{
auto buffer = new char [BUFFER_SIZE];
memset(buffer,123,BUFFER_SIZE);
measure(FILE * write,[buffer]()
{
FILE * file = fopen(test_file_write,wb);
fwrite (buffer,1,BUFFER_SIZE,file);
fclose(file);
});
measure(FILE * read,[buffer]()
{
FILE * file = fopen(test_file_read,rb); ,BUFFER_SIZE,file);
fclose(file);
});
measure(fstream write,[buffer]()
{
std :: ofstream stream(test_stream_write,std :: ios :: binary);
stream。 write(buffer,BUFFER_SIZE);
});
measure(fstream read,[buffer]()
{
std :: ifstream stream(test_stream_read,std :: ios :: binary);
stream。 read(buffer,BUFFER_SIZE);
});
delete [] buffer;
}
在我的机器上运行此代码的结果是:
FILE *写入1388.59 ms
FILE *读取1292.51 ms
fstream写入3105.38 ms
fstream读取3319.82 ms
fstream
FILE *
写/读!这时,读取一个大数据块,没有任何解析或 fstreams
的其他功能。我在Mac OS,Intel I7 2.6GHz,16GB 1600 MHz RAM,SSD驱动器上运行代码。请注意,再次运行相同的代码, FILE * read
的时间非常低(约200 ms),可能是因为文件被缓存...这是为什么打开的文件
为什么在使用 fstream
只读一个二进制数据的blob时比较 FILE *
?
编辑1:我更新了代码和时间。抱歉迟到!
编辑2:我添加了命令行和新结果(非常类似于前几个!)
$ clang ++ main.cpp -std = c ++ 11 -stdlib = libc ++ -O3
$ ./a.out
FILE * write 1417.9 ms
FILE * read 1292.59 ms
fstream write 3214.02 ms
fstream read 3052.56 ms
按照第二次运行的结果:
$ ./a.out
FILE * write 1428.98 ms
FILE * read 196.902 ms
fstream write 3343.69 ms
fstream read 2285.93 ms
当读取 FILE *
和流时看起来文件会被缓存$ c $
FILE * file = fopen(test_file_write,wb);
fwrite(buffer,1,BUFFER_SIZE,file);
fclose(file);
std :: ofstream stream(test_stream_write,std :: ios :: binary);
stream.write(buffer,BUFFER_SIZE);
并启动了分析器。看起来像 stream
在 xsputn
函数中花费了大量时间,而实际的 code>调用具有相同的持续时间(因为它应该是,它是相同的函数...)
自符号名称
3266.0ms 66.9%0,0 std :: __ 1 :: basic_ostream< char,std :: __ 1 :: char_traits< char> > :: write(char const *,long)
3265.0ms 66.9%2145,0 std :: __ 1 :: basic_streambuf< char,std :: __ 1 :: char_traits< char> > :: xsputn(char const *,long)
1120.0ms 22.9%7,0 std :: __ 1 :: basic_filebuf< char,std :: __ 1 :: char_traits< char> > :: overflow(int)
1112.0ms 22.7%2,0 fwrite
1127.0ms 23.0%0,0 fwrite
编辑4 由于某种原因,此问题被标记为重复。我想指出,我根本不使用 printf
,我只使用 std :: cout
时间。 read
部分中使用的文件是 write
部分的输出,使用不同的名称复制以避免缓存
看起来,在Linux上,对于这么大的数据集, fwrite
更有效率,因为它使用
写
而不是 writev
。
我不确定WHY writev
是否比写
,但似乎是差异在哪里。我看到绝对没有真正的原因,为什么 fstream
需要使用这种情况下的结构。
这可以通过使用 strace ./a.out
(其中 a .out
是程序测试此)。
输出:
Fstream:
clock_gettime(CLOCK_REALTIME,{1411978373,114560081})= 0
open(test,O_WRONLY | O_CREAT | O_TRUNC,0666)= 3
writev NULL,0},{\0\0\0\0\0\0\0\0\0\0\0\0\0\0\\ \\ 0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0 \0...,1073741824}],2)= 1073741824
close(3)= 0
clock_gettime(CLOCK_REALTIME,{1411978386,376353883})= 0
write fstream write 13261.8 ms\\\
,25fstream write 13261.8 ms)= 25
FILE *:
clock_gettime(CLOCK_REALTIME,{1411978386,930326134})= 0
open(test,O_WRONLY | O_CREAT | O_TRUNC ,0666)= 3
write(3,\0\0\0\0\0\0\0\0\0\0\0\0 \0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\ 0 \0\0\0...,1073741824)= 1073741824
clock_gettime(CLOCK_REALTIME,{1411978388,584197782})= 0
write(1,FILE * write 1653.87 ms\\ \\ n,23FILE * write 1653.87 ms)= 23
,所以我的机器会有点慢 - 或者其他的东西在我的情况下更慢。
正如Jan Hudec所指出的,我误解了结果。我只是这样写:
#include< sys / types.h>
#include< sys / stat.h>
#include< fcntl.h>
#include< sys / uio.h>
#include< unistd.h>
#include< iostream>
#include< cstdlib>
#include< cstring>
#include< functional>
#include< chrono>
void measure(const std :: string& test,std :: function< void()> function)
{
auto start_time = std :: chrono :: high_resolution_clock ::现在();
function();
auto duration = std :: chrono :: duration_cast< std :: chrono :: nanoseconds>(std :: chrono :: high_resolution_clock :: now() - start_time);
std :: cout<<< test<<<< static_cast< double>(duration.count())* 0.000001<ms< std :: endl;
}
#define BUFFER_SIZE(1024 * 1024 * 1024)
int main()
{
自动缓冲= new char [BUFFER_SIZE];
memset(buffer,0,BUFFER_SIZE);
measure(writev,[buffer]()
{
int fd = open(test,O_CREAT | O_WRONLY);
struct iovec vec [ ] =
{
{NULL,0},
{(void *)buffer,BUFFER_SIZE}
};
writev(fd,vec,sizeof(vec) / sizeof(vec [0]));
close(fd);
});
measure(write,[buffer]()
{
int fd = open(test,O_CREAT | O_WRONLY);
write buffer,BUFFER_SIZE);
close(fd);
});
}
这是实际的 fstream
实现,做某事愚蠢 - 可能复制整个数据在小块,某处和某种方式,或类似的东西。我会尝试进一步了解。
对于这两种情况,结果几乎相同,并且比 fstream
和 FILE *
变体。
编辑:
似乎在我的机器上,如果你添加 fclose(file)
在写入后,对于 fstream
和 FILE花费大约相同的时间*
- 在我的系统上,大约13秒写入1GB - 使用旧式旋转磁盘类型驱动器,而不是SSD。
使用此代码更快速:
#include< sys / types.h>
#include< sys / stat.h>
#include< fcntl.h>
#include< sys / uio.h>
#include< unistd.h>
#include< iostream>
#include< cstdlib>
#include< cstring>
#include< functional>
#include< chrono>
void measure(const std :: string& test,std :: function< void()> function)
{
auto start_time = std :: chrono :: high_resolution_clock ::现在();
function();
auto duration = std :: chrono :: duration_cast< std :: chrono :: nanoseconds>(std :: chrono :: high_resolution_clock :: now() - start_time);
std :: cout<<< test<<<< static_cast< double>(duration.count())* 0.000001<ms< std :: endl;
}
#define BUFFER_SIZE(1024 * 1024 * 1024)
int main()
{
自动缓冲= new char [BUFFER_SIZE];
memset(buffer,0,BUFFER_SIZE);
measure(writev,[buffer]()
{
int fd = open(test,O_CREAT | O_WRONLY,0660);
struct iovec vec [] =
{
{NULL,0},
{(void *)buffer,BUFFER_SIZE}
};
writev(fd,vec,sizeof vec)/ sizeof(vec [0]));
close(fd);
});
measure(write,[buffer]()
{
int fd = open(test,O_CREAT | O_WRONLY,0660);
write fd,buffer,BUFFER_SIZE);
close(fd);
});
}
给出约650-900毫秒的时间。
我也可以编辑原始程序,给予 fwrite
大约1000ms的时间 - 只需删除 fclose
。
我也添加了这个方法:
measure (new),[buffer]()
{
std :: ofstream * stream = new std :: ofstream(test,std :: ios :: binary);
- > write(buffer,BUFFER_SIZE);
//故意不删除
});
,这里也需要大约1000 ms。
所以,我的结论是,不知何故,有时,关闭文件使它冲洗到磁盘。在其他情况下,它不会。我还是不明白为什么...
I was working on a simple parser and when profiling I observed the bottleneck is in... file read! I extracted very simple test to compare the performance of fstreams
and FILE*
when reading a big blob of data:
#include <stdio.h>
#include <chrono>
#include <fstream>
#include <iostream>
#include <functional>
void measure(const std::string& test, std::function<void()> function)
{
auto start_time = std::chrono::high_resolution_clock::now();
function();
auto duration = std::chrono::duration_cast<std::chrono::nanoseconds>(std::chrono::high_resolution_clock::now() - start_time);
std::cout<<test<<" "<<static_cast<double>(duration.count()) * 0.000001<<" ms"<<std::endl;
}
#define BUFFER_SIZE (1024 * 1024 * 1024)
int main(int argc, const char * argv[])
{
auto buffer = new char[BUFFER_SIZE];
memset(buffer, 123, BUFFER_SIZE);
measure("FILE* write", [buffer]()
{
FILE* file = fopen("test_file_write", "wb");
fwrite(buffer, 1, BUFFER_SIZE, file);
fclose(file);
});
measure("FILE* read", [buffer]()
{
FILE* file = fopen("test_file_read", "rb");
fread(buffer, 1, BUFFER_SIZE, file);
fclose(file);
});
measure("fstream write", [buffer]()
{
std::ofstream stream("test_stream_write", std::ios::binary);
stream.write(buffer, BUFFER_SIZE);
});
measure("fstream read", [buffer]()
{
std::ifstream stream("test_stream_read", std::ios::binary);
stream.read(buffer, BUFFER_SIZE);
});
delete[] buffer;
}
The results of running this code on my machine are:
FILE* write 1388.59 ms
FILE* read 1292.51 ms
fstream write 3105.38 ms
fstream read 3319.82 ms
fstream
write/read are about 2 times slower than FILE*
write/read! And this while reading a big blob of data, without any parsing or other features of fstreams
. I'm running the code on Mac OS, Intel I7 2.6GHz, 16GB 1600 MHz Ram, SSD drive. Please note that running again same code the time for FILE* read
is very low (about 200 ms) probably because the file gets cached... This is why the files opened for reading are not created using the code.
Why when reading just a blob of binary data using fstream
is so slow compared to FILE*
?
EDIT 1: I updated the code and the times. Sorry for the delay!
EDIT 2: I added command line and new results (very similar to previous ones!)
$ clang++ main.cpp -std=c++11 -stdlib=libc++ -O3
$ ./a.out
FILE* write 1417.9 ms
FILE* read 1292.59 ms
fstream write 3214.02 ms
fstream read 3052.56 ms
Following the results for the second run:
$ ./a.out
FILE* write 1428.98 ms
FILE* read 196.902 ms
fstream write 3343.69 ms
fstream read 2285.93 ms
It looks like the file gets cached when reading for both FILE*
and stream
as the time reduces with the same amount for both of them.
EDIT 3: I reduced the code to this:
FILE* file = fopen("test_file_write", "wb");
fwrite(buffer, 1, BUFFER_SIZE, file);
fclose(file);
std::ofstream stream("test_stream_write", std::ios::binary);
stream.write(buffer, BUFFER_SIZE);
And started the profiler. It seems like stream
spends lots of time in xsputn
function, and the actual write
calls have the same duration (as it should be, it's the same function...)
Running Time Self Symbol Name
3266.0ms 66.9% 0,0 std::__1::basic_ostream<char, std::__1::char_traits<char> >::write(char const*, long)
3265.0ms 66.9% 2145,0 std::__1::basic_streambuf<char, std::__1::char_traits<char> >::xsputn(char const*, long)
1120.0ms 22.9% 7,0 std::__1::basic_filebuf<char, std::__1::char_traits<char> >::overflow(int)
1112.0ms 22.7% 2,0 fwrite
1127.0ms 23.0% 0,0 fwrite
EDIT 4 For some reason this question is marked as duplicate. I wanted to point out that I don't use printf
at all, I use only std::cout
to write the time. The files used in the read
part are the output from the write
part, copied with different name to avoid caching
It would seem that, on Linux, for this large set of data, the implementation of fwrite
is much more efficient, since it uses write
rather than writev
.
I'm not sure WHY writev
is so much slower than write
, but that appears to be where the difference is. And I see absolutely no real reason as to why the fstream
needs to use that construct in this case.
This can easily be seen by using strace ./a.out
(where a.out
is the program testing this).
Output:
Fstream:
clock_gettime(CLOCK_REALTIME, {1411978373, 114560081}) = 0
open("test", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
writev(3, [{NULL, 0}, {"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1073741824}], 2) = 1073741824
close(3) = 0
clock_gettime(CLOCK_REALTIME, {1411978386, 376353883}) = 0
write(1, "fstream write 13261.8 ms\n", 25fstream write 13261.8 ms) = 25
FILE*:
clock_gettime(CLOCK_REALTIME, {1411978386, 930326134}) = 0
open("test", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1073741824) = 1073741824
clock_gettime(CLOCK_REALTIME, {1411978388, 584197782}) = 0
write(1, "FILE* write 1653.87 ms\n", 23FILE* write 1653.87 ms) = 23
I don't have them fancy SSD drives, so my machine will be a bit slower on that - or something else is slower in my case.
As pointed out by Jan Hudec, I'm misinterpreting the results. I just wrote this:
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/uio.h>
#include <unistd.h>
#include <iostream>
#include <cstdlib>
#include <cstring>
#include <functional>
#include <chrono>
void measure(const std::string& test, std::function<void()> function)
{
auto start_time = std::chrono::high_resolution_clock::now();
function();
auto duration = std::chrono::duration_cast<std::chrono::nanoseconds>(std::chrono::high_resolution_clock::now() - start_time);
std::cout<<test<<" "<<static_cast<double>(duration.count()) * 0.000001<<" ms"<<std::endl;
}
#define BUFFER_SIZE (1024 * 1024 * 1024)
int main()
{
auto buffer = new char[BUFFER_SIZE];
memset(buffer, 0, BUFFER_SIZE);
measure("writev", [buffer]()
{
int fd = open("test", O_CREAT|O_WRONLY);
struct iovec vec[] =
{
{ NULL, 0 },
{ (void *)buffer, BUFFER_SIZE }
};
writev(fd, vec, sizeof(vec)/sizeof(vec[0]));
close(fd);
});
measure("write", [buffer]()
{
int fd = open("test", O_CREAT|O_WRONLY);
write(fd, buffer, BUFFER_SIZE);
close(fd);
});
}
It is the actual fstream
implementation that does something daft - probably copying the whole data in small chunks, somewhere and somehow, or something like that. I will try to find out further.
And the result is pretty much identical for both cases, and faster than both fstream
and FILE*
variants in the question.
Edit:
It would seem like, on my machine, right now, if you add fclose(file)
after the write, it takes approximately the same amount of time for both fstream
and FILE*
- on my system, around 13 seconds to write 1GB - with old style spinning disk type drives, not SSD.
I can however write MUCH faster using this code:
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/uio.h>
#include <unistd.h>
#include <iostream>
#include <cstdlib>
#include <cstring>
#include <functional>
#include <chrono>
void measure(const std::string& test, std::function<void()> function)
{
auto start_time = std::chrono::high_resolution_clock::now();
function();
auto duration = std::chrono::duration_cast<std::chrono::nanoseconds>(std::chrono::high_resolution_clock::now() - start_time);
std::cout<<test<<" "<<static_cast<double>(duration.count()) * 0.000001<<" ms"<<std::endl;
}
#define BUFFER_SIZE (1024 * 1024 * 1024)
int main()
{
auto buffer = new char[BUFFER_SIZE];
memset(buffer, 0, BUFFER_SIZE);
measure("writev", [buffer]()
{
int fd = open("test", O_CREAT|O_WRONLY, 0660);
struct iovec vec[] =
{
{ NULL, 0 },
{ (void *)buffer, BUFFER_SIZE }
};
writev(fd, vec, sizeof(vec)/sizeof(vec[0]));
close(fd);
});
measure("write", [buffer]()
{
int fd = open("test", O_CREAT|O_WRONLY, 0660);
write(fd, buffer, BUFFER_SIZE);
close(fd);
});
}
gives times of about 650-900 ms.
I can also edit the original program to give a time of approximately 1000ms for fwrite
- simply remove the fclose
.
I also added this method:
measure("fstream write (new)", [buffer]()
{
std::ofstream* stream = new std::ofstream("test", std::ios::binary);
stream->write(buffer, BUFFER_SIZE);
// Intentionally no delete.
});
and then it takes about 1000 ms here too.
So, my conclusion is that, somehow, sometimes, closing the file makes it flush to disk. In other cases, it doesn't. I still don't understand why...
这篇关于为什么std :: fstreams这么慢?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!