Why is iterating through `std::vector` faster than iterating through `std::array`?
Question
I recently asked this question: Why is iterating an std::array much faster than iterating an std::vector?
As people quickly pointed out, my benchmark had many flaws. So as I was trying to fix my benchmark, I noticed that std::vector wasn't slower than std::array and, in fact, it was quite the opposite.
#include <vector>
#include <array>
#include <stdio.h>
#include <chrono>

using namespace std;

constexpr int n = 100'000'000;
vector<int> v(n);
//array<int, n> v;

int main()
{
    int res = 0;
    auto start = chrono::steady_clock::now();
    for(int x : v)
        res += x;
    auto end = chrono::steady_clock::now();
    auto diff = end - start;
    double elapsed =
        std::chrono::duration_cast<
            std::chrono::duration<double, std::milli>
        >(diff).count();
    printf("result: %d\ntime: %f\n", res, elapsed);
}
Things I've tried to improve from my previous benchmark:
- Made sure I'm using the result, so the whole loop is not optimized away
- Using the -O3 flag for speed
- Using std::chrono instead of the time command. That's so we can isolate the part we want to measure (just the for loop). Static initialization of variables and things like that won't be measured.
Measured times (in milliseconds):
array:
$ g++ arrVsVec.cpp -O3
$ ./a.out
result: 0
time: 99.554109
vector:
$ g++ arrVsVec.cpp -O3
$ ./a.out
result: 0
time: 30.734491
I'm just wondering what I'm doing wrong this time.
Recommended answer
The difference is due to the memory pages of the array not being resident in the process address space. A global-scope array is stored in the .bss section of the executable, which is zero-initialized and has not been paged in yet, so the timed loop also pays for the page faults taken on first access. The vector, by contrast, has just been allocated and zero-filled, so its memory pages are already present.
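The page-residency explanation can be checked by counting page faults around two consecutive passes over the same global array. The following is a minimal sketch, assuming a POSIX system where getrusage reports minor faults in ru_minflt; the names g_data, kCount, minor_faults and sum_pass are hypothetical, and the array size is reduced from the question's for illustration.

```cpp
#include <array>
#include <cstddef>
#include <sys/resource.h>  // getrusage (POSIX; a Linux-like system is assumed)

// Hypothetical size for illustration: 16 Mi ints, zero-initialized in .bss.
constexpr std::size_t kCount = 16 * 1024 * 1024;
std::array<int, kCount> g_data;  // global, never written before the first pass

// Minor page faults taken by this process so far.
long minor_faults() {
    struct rusage usage{};
    getrusage(RUSAGE_SELF, &usage);
    return usage.ru_minflt;
}

struct PassResult {
    long long sum;  // sum of all elements
    long faults;    // minor faults taken while summing
};

// Sum the array once and report how many minor faults the pass triggered.
PassResult sum_pass() {
    const long before = minor_faults();
    long long sum = 0;
    for (int x : g_data) sum += x;
    const long after = minor_faults();
    return {sum, after - before};
}
```

Calling sum_pass() twice from main and printing both results should, under these assumptions, show most of the faults in the first call, since the second call finds the pages already mapped. The exact counts depend on the page size and kernel settings.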
If you add
std::fill_n(v.data(), n, 1); // included in <algorithm>
as the first line of main to bring the pages in (pre-fault), the array time becomes the same as that of vector.
On Linux, instead of that, you can do mlock(v.data(), v.size() * sizeof(v[0])); to bring the pages into the address space. See man mlock for full details.
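Because mlock can fail, its return value is worth checking. For an unprivileged process the lockable amount is capped by RLIMIT_MEMLOCK, which is often far smaller than the roughly 400 MB used in the question (assuming a 4-byte int). The sketch below wraps the call; try_lock_pages is a hypothetical helper name, not part of any library.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <sys/mman.h>  // mlock, munlock (POSIX; Linux assumed)
#include <vector>

// Hypothetical helper: try to lock a contiguous container's storage into RAM.
// Works for std::vector and std::array alike, since both provide data() and size().
// Returns true on success; on failure it prints the reason and returns false.
template <typename Container>
bool try_lock_pages(Container& c) {
    if (c.size() == 0) return true;  // nothing to lock
    const std::size_t bytes = c.size() * sizeof(c[0]);
    if (mlock(c.data(), bytes) != 0) {
        // Typical causes: ENOMEM or EPERM when RLIMIT_MEMLOCK is too low.
        std::fprintf(stderr, "mlock failed: %s\n", std::strerror(errno));
        return false;
    }
    return true;
}
```

If the call fails, falling back to the std::fill_n pre-fault from the answer still brings the pages in without needing a higher limit.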