Why is iterating through `std::vector` faster than iterating through `std::array`?


Question

I recently asked this question: Why is iterating an std::array much faster than iterating an std::vector?

As people quickly pointed out, my benchmark had many flaws. So as I was trying to fix my benchmark, I noticed that std::vector wasn't slower than std::array and, in fact, it was quite the opposite.

#include <vector>
#include <array>
#include <stdio.h>
#include <chrono>

using namespace std;

constexpr int n = 100'000'000;
vector<int> v(n);
//array<int, n> v;

int main()
{
    int res = 0;
    auto start = chrono::steady_clock::now();
    for(int x : v)
        res += x;
    auto end = chrono::steady_clock::now();
    auto diff = end - start;
    double elapsed =
        std::chrono::duration_cast<
            std::chrono::duration<double, std::milli>
        >(diff).count();
    printf("result: %d\ntime: %f\n", res, elapsed);
}

Things I've tried to improve from my previous benchmark:

  • Made sure I'm using the result, so the whole loop is not optimized away
  • Using -O3 flag for speed
  • Use std::chrono instead of the time command. That's so we can isolate the part we want to measure (just the for loop). Static initialization of variables and things like that won't be measured.

Measured times:

array:

$ g++ arrVsVec.cpp -O3
$ ./a.out
result: 0
time: 99.554109

vector:

$ g++ arrVsVec.cpp -O3
$ ./a.out
result: 0
time: 30.734491

I'm just wondering what I'm doing wrong this time.

View the disassembly on godbolt

Recommended answer

The difference is due to the memory pages of the array not being resident in the process address space (a global-scope array is stored in the .bss section of the executable, which is zero-initialized and has not been paged in yet). The vector, by contrast, has just been allocated and zero-filled, so its memory pages are already present.

If you add

std::fill_n(v.data(), n, 1); // included in <algorithm>

as the first line of main to bring the pages in (pre-fault), that makes array time the same as that of vector.
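One way to see the page-fault cost without changing the data is to time two consecutive passes over the same untouched global array. The following is only a sketch: the names `g_data`, `PassTimes`, `timed_sum` and `compare_passes` are made up for illustration, and the array size is an assumption chosen to be smaller than the one in the question. The first pass should pay for faulting the .bss pages in and the second pass should not, but the actual numbers depend on the OS, the compiler and the machine.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical size for this sketch: 16M ints (64 MiB), smaller than the
// question's 100'000'000 so it runs comfortably on most machines.
constexpr std::size_t kCount = 16 * 1024 * 1024;

// Zero-initialized global, so it lives in .bss like the array in the question.
std::array<int, kCount> g_data;

struct PassTimes {
    double first_ms;   // pass over pages that have not been touched yet
    double second_ms;  // pass after the first one has faulted the pages in
    long long sum;     // returned so the loops cannot be optimized away
};

// Sums the array once, adds to `sum`, and returns the elapsed milliseconds.
static double timed_sum(long long& sum) {
    auto start = std::chrono::steady_clock::now();
    for (int x : g_data)
        sum += x;
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}

// Runs the same loop twice; only the first run touches non-resident pages.
PassTimes compare_passes() {
    PassTimes t{0.0, 0.0, 0};
    t.first_ms = timed_sum(t.sum);
    t.second_ms = timed_sum(t.sum);
    assert(t.sum == 0);  // the array is never written, so the sum stays zero
    return t;
}
```

Calling `compare_passes()` from `main` and printing `first_ms` and `second_ms` gives a rough picture of how much of the measured array time is page-fault overhead rather than iteration.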

On Linux, instead of that, you can do mlock(v.data(), v.size() * sizeof(v[0])); to bring the pages into the address space. See man mlock for full details.
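A caveat worth keeping in mind: `mlock` can fail, for example when the requested size exceeds the process's `RLIMIT_MEMLOCK` limit, which is plausible for a 400 MB buffer. The sketch below wraps the call in a hypothetical helper, `prefault` (not part of any library), that checks the return value and otherwise falls back to touching one byte per page by hand. It assumes a POSIX system.

```cpp
#include <sys/mman.h>  // mlock (POSIX)
#include <unistd.h>    // sysconf
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: tries to make [p, p + bytes) resident before timing.
// Returns true if mlock succeeded (the pages then stay locked until munlock
// or process exit), false if the manual page-touching fallback was used.
bool prefault(void* p, std::size_t bytes) {
    assert(p != nullptr);
    if (bytes == 0)
        return true;
    if (mlock(p, bytes) == 0)
        return true;

    // Fallback: rewrite one byte per page with its own value, which forces
    // the kernel to back every page without changing the data.
    long page = sysconf(_SC_PAGESIZE);
    std::size_t step = page > 0 ? static_cast<std::size_t>(page) : 4096;
    volatile char* mem = static_cast<volatile char*>(p);
    for (std::size_t off = 0; off < bytes; off += step)
        mem[off] = mem[off];
    mem[bytes - 1] = mem[bytes - 1];  // covers a final page missed by the stride
    return false;
}

// Convenience wrapper for the container type used in the question.
bool prefault(std::vector<int>& v) {
    return v.empty() ? true : prefault(v.data(), v.size() * sizeof(v[0]));
}
```

Calling `prefault(v.data(), v.size() * sizeof(v[0]))` as the first line of `main` would play the same role as the `std::fill_n` call, except that the contents of `v` are left unchanged.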
