为什么迭代std :: array比迭代std :: vector快得多? [英] Why is iterating an std::array much faster than iterating an std::vector?

查看:99
本文介绍了为什么迭代std :: array比迭代std :: vector快得多?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

编者注:
启用优化的跟进问题仅对循环计时:
为什么通过`std :: vector`进行迭代的速度比为什么快通过`std :: array`进行迭代?
在这里,我们可以看到延迟分配页面错误在读取未初始化的BSS内存与在定时循环之外初始化的动态分配+写入内存的影响.

Editor's note:
Followup question with optimization enabled that times only the loop:
Why is iterating though `std::vector` faster than iterating though `std::array`?
where we can see the effect of lazy-allocation page faults in reading uninitialized BSS memory vs. dynamically-allocated + written memory that's initialized outside the timed loop.

我尝试分析此代码:

#include <vector>
#include <array>
#include <stdio.h>

using namespace std;

constexpr int n = 400'000'000;
//vector<int> v(n);
array<int, n> v;

int main()
{
    int res = 0;
    for(int x : v)
        res += x;
    printf("%d\n", res);
}

在我的计算机上,array版本比vector快.

On my machine, the array version is faster than vector.

在这种情况下,内存分配是无关紧要的,因为它只有一次.

The memory allocation is irrelevant in this case as it's only once.

$ g++ arrVsVec.cpp -O3
$ time ./a.out
0

real    0m0,445s
user    0m0,203s
sys 0m0,238s

$ g++ arrVsVec.cpp -O3
$ time ./a.out
0

real    0m0,749s
user    0m0,273s
sys 0m0,476s

我发现对于std::vector,反汇编要复杂得多: https://godbolt.org/z /111L5G

I have found the disassembly is much more complicated for std::vector: https://godbolt.org/z/111L5G

推荐答案

针对(g++ -O2)进行优化的答案:

Answer for optimization on (g++ -O2):

您没有使用最终结果,因此编译器可以自由地优化整个循环.

You're not using the end result, so the compiler is free to optimize the entire loop out.

array情况会发生什么情况.

Which is what happens in the array case.

main:
        xor     eax, eax
        ret

但是vector分配和取消分配堆内存,这使优化变得复杂,并且编译器倾向于安全地使用它,并且

But the vector allocates and deallocates heap memory, which complicates the optimization and the compilers tend to play it safe and leave it in place:

main:
        xor     eax, eax
        ret
_GLOBAL__sub_I_v:
        sub     rsp, 8
        mov     edi, 400000000
        mov     QWORD PTR v[rip], 0
        mov     QWORD PTR v[rip+8], 0
        mov     QWORD PTR v[rip+16], 0
        call    operator new(unsigned long)
        lea     rdx, [rax+400000000]
        mov     QWORD PTR v[rip], rax
        mov     QWORD PTR v[rip+16], rdx
.L6:
        mov     DWORD PTR [rax], 0
        add     rax, 4
        cmp     rdx, rax
        jne     .L6
        mov     QWORD PTR v[rip+8], rdx
        mov     esi, OFFSET FLAT:v
        mov     edx, OFFSET FLAT:__dso_handle
        mov     edi, OFFSET FLAT:_ZNSt6vectorIiSaIiEED1Ev
        add     rsp, 8
        jmp     __cxa_atexit

所以这就是在这种情况下array版本更快的原因.在更实际的应用程序中,差异不会那么大,并且主要取决于vector情况下的堆分配/取消分配时间.

So that's why the array version is faster in this particular case. In a more real-life application the difference wouldn't be that dramatic, and will mostly come down to the heap allocation/deallocation time for the vector case.

关闭优化的答案(g++):

Answer for optimization off (g++):

未经优化就不要编译基准测试

Don't benchmark stuff compiled without optimization.

差异可能是由于未内联vector迭代器.因此,与数组访问相比,在调试中访问向量元素会导致额外的间接访问.

The difference is probably due to the vector iterator not being inlined. So accessing vector elements in debug more incurs an extra indirection compared to array access.

这篇关于为什么迭代std :: array比迭代std :: vector快得多?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆