为什么迭代std :: array比迭代std :: vector快得多? [英] Why is iterating an std::array much faster than iterating an std::vector?
问题描述
编者注:
启用优化的跟进问题仅对循环计时:
为什么通过`std :: vector`进行迭代的速度比为什么快通过`std :: array`进行迭代?
在这里,我们可以看到延迟分配页面错误在读取未初始化的BSS内存与在定时循环之外初始化的动态分配+写入内存的影响.
Editor's note:
Followup question with optimization enabled that times only the loop:
Why is iterating though `std::vector` faster than iterating though `std::array`?
where we can see the effect of lazy-allocation page faults in reading uninitialized BSS memory vs. dynamically-allocated + written memory that's initialized outside the timed loop.
我尝试分析此代码:
#include <vector>
#include <array>
#include <stdio.h>
using namespace std;
constexpr int n = 400'000'000;
//vector<int> v(n);
array<int, n> v;
int main()
{
int res = 0;
for(int x : v)
res += x;
printf("%d\n", res);
}
在我的计算机上,array
版本比vector
快.
On my machine, the array
version is faster than vector
.
在这种情况下,内存分配是无关紧要的,因为它只有一次.
The memory allocation is irrelevant in this case as it's only once.
$ g++ arrVsVec.cpp -O3
$ time ./a.out
0
real 0m0,445s
user 0m0,203s
sys 0m0,238s
$ g++ arrVsVec.cpp -O3
$ time ./a.out
0
real 0m0,749s
user 0m0,273s
sys 0m0,476s
我发现对于std::vector
,反汇编要复杂得多: https://godbolt.org/z /111L5G
I have found the disassembly is much more complicated for std::vector
: https://godbolt.org/z/111L5G
推荐答案
针对(g++ -O2
)进行优化的答案:
Answer for optimization on (g++ -O2
):
您没有使用最终结果,因此编译器可以自由地优化整个循环.
You're not using the end result, so the compiler is free to optimize the entire loop out.
array
情况会发生什么情况.
Which is what happens in the array
case.
main:
xor eax, eax
ret
但是vector
分配和取消分配堆内存,这使优化变得复杂,并且编译器倾向于安全地使用它,并且
But the vector
allocates and deallocates heap memory, which complicates the optimization and the compilers tend to play it safe and leave it in place:
main:
xor eax, eax
ret
_GLOBAL__sub_I_v:
sub rsp, 8
mov edi, 400000000
mov QWORD PTR v[rip], 0
mov QWORD PTR v[rip+8], 0
mov QWORD PTR v[rip+16], 0
call operator new(unsigned long)
lea rdx, [rax+400000000]
mov QWORD PTR v[rip], rax
mov QWORD PTR v[rip+16], rdx
.L6:
mov DWORD PTR [rax], 0
add rax, 4
cmp rdx, rax
jne .L6
mov QWORD PTR v[rip+8], rdx
mov esi, OFFSET FLAT:v
mov edx, OFFSET FLAT:__dso_handle
mov edi, OFFSET FLAT:_ZNSt6vectorIiSaIiEED1Ev
add rsp, 8
jmp __cxa_atexit
所以这就是在这种情况下array
版本更快的原因.在更实际的应用程序中,差异不会那么大,并且主要取决于vector
情况下的堆分配/取消分配时间.
So that's why the array
version is faster in this particular case. In a more real-life application the difference wouldn't be that dramatic, and will mostly come down to the heap allocation/deallocation time for the vector
case.
关闭优化的答案(g++
):
Answer for optimization off (g++
):
未经优化就不要编译基准测试
Don't benchmark stuff compiled without optimization.
差异可能是由于未内联vector
迭代器.因此,与数组访问相比,在调试中访问向量元素会导致额外的间接访问.
The difference is probably due to the vector
iterator not being inlined. So accessing vector elements in debug more incurs an extra indirection compared to array access.
这篇关于为什么迭代std :: array比迭代std :: vector快得多?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!