fread slow performance in OpenMP threads
Question
I use two Intel Xeon CPUs (24 cores in total) and Windows Server 2008.
I am trying to parallelize my C++ program. Template code here:
vector< string > files;
vector< vector< float > > data;
...
data.resize( files.size() );
#pragma omp parallel for
for (int i=0; i<files.size(); i++) { // Files count is about 3000
    FILE *f = fopen(files[i].c_str(), "rb");
    // every file is about 40 MB
    data[i].resize(someSize);
    fread(&data[i][0], sizeof(float), someSize, f);
    fclose(f);
    ...
    performCalculations();
}
CPU usage is only 0 to 5%.
When I replace fread(&data[i][0], sizeof(float), someSize, f) with:
for (int j=0; j<data[i].size(); j++) {
    data[i][j] = rand();
}
CPU usage increases to 100%.
I already tried fstream and the WinAPI ReadFile, but neither had a big effect.
What am I doing wrong? I don't believe that disk reading can be so slow...
Recommended answer
"I don't believe that disk reading can be so slow..."
Then you'd better start believing. Disks are incredibly slow compared to CPUs. Parallel I/O usually only helps when you're reading from multiple sources such as separate disks or network connections. It can solve latency problems well, but not bandwidth problems.
Try reading in all your data in one go, serially, and then processing it in a parallelized loop.
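The read-then-process split suggested above could look roughly like the following. This is only a minimal sketch under stated assumptions: readAllSerially and processInParallel are hypothetical helper names, and performCalculations here is a stand-in that sums one buffer, since the question does not show what the real calculation does.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-in for the question's performCalculations():
// it just sums the values of one buffer.
double performCalculations(const std::vector<float>& values) {
    double sum = 0.0;
    for (float v : values) sum += v;
    return sum;
}

// Phase 1: read the files one after another on a single thread,
// so the disk sees one sequential reader instead of many competing ones.
std::vector<std::vector<float>> readAllSerially(
        const std::vector<std::string>& files, std::size_t someSize) {
    std::vector<std::vector<float>> data(files.size());
    for (std::size_t i = 0; i < files.size(); ++i) {
        FILE* f = std::fopen(files[i].c_str(), "rb");
        if (!f) continue;  // data[i] stays empty if the file cannot be opened
        data[i].resize(someSize);
        std::size_t got = std::fread(data[i].data(), sizeof(float), someSize, f);
        data[i].resize(got);  // keep only the floats that were actually read
        std::fclose(f);
    }
    return data;
}

// Phase 2: the CPU-bound work runs in parallel on data already in memory.
std::vector<double> processInParallel(
        const std::vector<std::vector<float>>& data) {
    std::vector<double> results(data.size());
    #pragma omp parallel for
    for (int i = 0; i < static_cast<int>(data.size()); ++i) {
        results[i] = performCalculations(data[i]);
    }
    return results;
}
```

With about 3000 files of roughly 40 MB each, the whole data set is on the order of 120 GB and may not fit in memory at once. In that case the same two phases could be applied to one batch of files at a time.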