Matrix/Vector initialization performance


Question

This is more of an educational question; there is no specific problem I am trying to solve. I would like some insight into what happens "behind the scenes" in the following scenario:

We have two ints, w and h, and we need a matrix (vector<vector<int>>) of 0s. There are multiple ways to do this, and I would like to know which one performs best (probably meaning which one performs the fewest copies).

Option 1:

vector<vector<int>> m;

for (int i = 0; i < h; i++)
{
    m.push_back(vector<int>());
    for (int j = 0; j < w; j++)
        m[i].push_back(0);
}

Option 2:

vector<vector<int>> m;

for (int i = 0; i < h; i++)
    m.push_back(vector<int>(w, 0));

Option 3:

vector<vector<int>> m(h, vector<int>(w, 0));

Is the temporary value at m.push_back(vector<int>()); / m.push_back(vector<int>(w, 0)); created in memory and then also copied into m? If it is, wouldn't it be better to use option 1 to minimize copying? (Suppose we are only talking about larger arrays, say 1,000,000 x 1,000,000.) The same dilemma applies to option 3: which tends to be faster (at least on paper), and why?

Recommended answer

If you want performance from a Matrix class, don't use std::vector<std::vector<T>> in the first place. Write a proper class that encapsulates a one-dimensional std::vector<T>. A vector of vectors is fragmented in memory: each row is a separate heap allocation.
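
A minimal sketch of what such a wrapper could look like follows. The class name Matrix, its members, and the row-major layout are illustrative assumptions, not something the answer specifies; the point is only that all h * w elements live in one contiguous allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical Matrix wrapper: one contiguous buffer, row-major indexing.
template <typename T>
class Matrix {
public:
    // Value-initializes rows * cols elements (zeros for int) in a single allocation.
    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols) {}

    // Element access; bounds are checked only in debug builds via assert.
    T& operator()(std::size_t r, std::size_t c) {
        assert(r < rows_ && c < cols_);
        return data_[r * cols_ + c];
    }
    const T& operator()(std::size_t r, std::size_t c) const {
        assert(r < rows_ && c < cols_);
        return data_[r * cols_ + c];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<T> data_;  // element (r, c) is stored at index r * cols_ + c
};
```

With this sketch, Matrix<int> m(h, w); replaces all three options from the question, and m(i, j) replaces m[i][j].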

A one-trillion-element matrix is technically possible on commercial hardware nowadays, but to initialize it you really, really want multiple threads. That's another practical objection to your three examples.
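
One way the multi-threaded initialization could be sketched is shown below. The helper name make_zeroed_parallel and the chunking scheme are assumptions made for illustration. The buffer is allocated with new int[n], which leaves the elements uninitialized, so the worker threads do the only pass over the memory; a std::vector<int>(n) would already have zeroed everything on a single thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical helper: allocates n ints without initializing them, then lets
// up to num_threads workers each zero one contiguous chunk.
std::unique_ptr<int[]> make_zeroed_parallel(std::size_t n, unsigned num_threads) {
    assert(num_threads > 0);
    std::unique_ptr<int[]> data(new int[n]);  // default-initialized: no zeroing pass yet

    const std::size_t chunk = (n + num_threads - 1) / num_threads;  // ceiling division
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(n, t * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        if (begin == end) break;  // nothing left to hand out
        int* p = data.get();
        workers.emplace_back([p, begin, end] { std::fill(p + begin, p + end, 0); });
    }
    for (std::thread& w : workers) w.join();  // wait until every chunk is written
    return data;
}
```

A call such as make_zeroed_parallel(h * w, std::thread::hardware_concurrency()) would be a plausible starting point, bearing in mind that hardware_concurrency() may return 0 and would then need a fallback value.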

Having said that, for small experiments all your code is fine.
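
As one such small experiment, the sketch below counts element copies for the three options from the question. Tracer, Grid, option1 to option3, and count_copies are hypothetical names introduced here, and the sketch assumes C++17, where the move constructor of std::vector is noexcept, so the outer vector moves its rows when it reallocates instead of copying them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type that counts how often it is copied or moved.
struct Tracer {
    static std::size_t copies;
    static std::size_t moves;
    Tracer() = default;
    Tracer(const Tracer&) { ++copies; }
    Tracer(Tracer&&) noexcept { ++moves; }
    Tracer& operator=(const Tracer&) { ++copies; return *this; }
    Tracer& operator=(Tracer&&) noexcept { ++moves; return *this; }
};
std::size_t Tracer::copies = 0;
std::size_t Tracer::moves = 0;

using Grid = std::vector<std::vector<Tracer>>;

Grid option1(std::size_t h, std::size_t w) {
    Grid m;
    for (std::size_t i = 0; i < h; i++) {
        m.push_back(std::vector<Tracer>());  // temporary row is moved in
        for (std::size_t j = 0; j < w; j++)
            m[i].push_back(Tracer());        // temporary element is moved in
    }
    return m;
}

Grid option2(std::size_t h, std::size_t w) {
    Grid m;
    for (std::size_t i = 0; i < h; i++)
        m.push_back(std::vector<Tracer>(w, Tracer()));  // w copies fill the row
    return m;
}

Grid option3(std::size_t h, std::size_t w) {
    // One prototype row (w copies), then h copies of that row (h * w copies).
    return Grid(h, std::vector<Tracer>(w, Tracer()));
}

// Resets the counters, runs one option, and returns the number of element copies.
template <typename F>
std::size_t count_copies(F build, std::size_t h, std::size_t w) {
    Tracer::copies = 0;
    Tracer::moves = 0;
    Grid m = build(h, w);
    assert(m.size() == h);
    return Tracer::copies;
}
```

Under those assumptions, count_copies should report 0 copies for option 1 (the temporaries are moved, and only Tracer::moves grows), h * w for option 2, and (h + 1) * w for option 3. For plain int elements a copy is just a store, so these counts say little about real run time; allocations and reallocations in option 1 are likely to matter more.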
