为什么 std::vector 可能比原始动态分配的数组更快? [英] Why might std::vector be faster than a raw dynamically allocated array?

查看:62
本文介绍了为什么 std::vector 可能比原始动态分配的数组更快?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

与同事讨论的结果我最终编写了基准来测试 std::vector 与原始动态分配的数组,结果出人意料.

The result of a discussion with a colleague I ended up writing benchmarks to test std::vector vs raw dynamically allocated arrays, and ended up with a surprise.

我的测试如下:

#include "testconsts.h" // defines NUM_INTS across all tests
#include <vector>

int main()
{
    const int numInts = NUM_INTS;
    std::vector<int>                     intVector( numInts );
    int * const                          intArray       = new int[ numInts ];

    ++intVector[0]; // force access to affect optimization
    ++intArray[0];  // force access to affect optimization

    for( int i = 0; i < numInts; ++i )
    {
        ++intArray[i];
    }

    delete[] intArray;
    return 0;
}

和:

#include "testconsts.h" // defines NUM_INTS across all tests
#include <vector>

int main()
{
    const int numInts = NUM_INTS;
    std::vector<int>                     intVector( numInts );
    int *                                intArray       = new int[ numInts ];

    ++intArray[0];  // force access to affect optimization
    ++intVector[0]; // force access to affect optimization

    for( int i = 0; i < numInts; ++i )
    {
        ++intVector[i];
    }

    delete[] intArray;
    return 0;
}

它们是用 g++ -O3 和 gcc 4.4.3 编译的

They are compiled with g++ -O3 with gcc 4.4.3

使用 time 进行多次基准测试的结果类似于:

The results of multiple runs of benchmarking using time are similar to:

数组:

real    0m0.757s
user    0m0.176s
sys     0m0.588s

矢量:

real    0m0.572s
user    0m0.268s
sys     0m0.304s

三件事很清楚:

  1. 数组在用户时间上更快
  2. Vector 速度更快,系统时间更少
  3. 总体而言,vector 赢得了这场战斗

问题是为什么?".

我猜测系统时间问题肯定与页面错误有关,但我无法准确地描述为什么会有更多的页面错误.

The system time issue I'm guessing must have to do with page faults, but I can't describe for myself exactly why one would have significantly more page faults.

至于用户时间问题,我不太感兴趣,但我仍然很好奇对此的看法.我曾想象它与初始化有关,但我没有将初始化值传递给向量构造函数,所以我不知道.

As for the user time issue, it's less interesting to me, but I'm still curious of opinions on that as well. I had imagined it had something to do with initialization, though I'm not passing an initialization value to the vector constructor so I don't know.

推荐答案

与动态数组相比,不同之处不在于向量的性能,而在于您执行的内存访问次数.

The difference is not in the performance of the vector compared to the dynamic array, but in the number of accesses to memory that you perform.

实际上,在向量测试中,您正在重新访问缓存内存,而在数组版本中则没有.在任何一种情况下,您都需要为缓存矢量版本付出代价.

Effectively, in the vector test you are re-accessing cached memory, while in the array version you don't. You pay the price of caching the vector version in either case.

在向量测试中,您为数组分配动态内存但保持原样,永远不会触及内存,因此不会因该操作而出现页面错误.向量被创建、初始化,然后第二遍将访问已经缓存的数据(如果大小适合缓存,如果不适合,它不会在缓存中,但在两个版本中将产生相同的成本).

In the vector test, you allocate the dynamic memory for the array but leave it untouched, with the memory never being touched there are no page faults due to that operation. The vector is created, initialized and then the second pass will be accessing already cached data (if the size fits the cache, if it does not, it will not be in cache, but the same cost will be incurred in both versions).

另一方面,在测试数组时,向量构造函数初始化元素,这意味着如果您试图分析数组的行为,则遍历向量内容遍历数组元素.使应用程序使用的内存访问次数、页面错误和内存增加一倍.

On the other hand when testing the array, the vector constructor initializes the elements, and that means that in the case were you are trying to profile the behavior of the array, the vector contents are walked over and the array elements are walked over. Double the number of memory accesses, page faults and memory used by the application.

您可以尝试修改代码,以便像这样执行动态分配:

You can try modifying the code so that the dynamic allocation is performed like this:

int * intArray = new int[ numInts ](); // note extra ()

哪个将值初始化整个数组,或者你如何初始化数组内容.运行该测试的修改版本的结果应该是相似的.

Which will value-initialize the whole array, or you initialize the array contents else how. The results of running that modified version of the test should be similar.

这篇关于为什么 std::vector 可能比原始动态分配的数组更快?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆