Returning an STL vector from a function - copy cost

Question

When you return an STL vector from a function:

vector<int> getLargeArray() {  ...  }

Is the return going to be an expensive copy operation? I remember reading somewhere that vector assignment is fast. Should I require the caller to pass a reference instead?

void getLargeArray( vector<int>& vec ) {  ...  }


Recommended answer

Assuming your function constructs and returns new data, you should return by value, and try to make sure that the function itself has one return point that returns a variable of type vector<int>, or at worst several return points that all return the same variable.

That ensures that you'll get the named return value optimization on any credible compiler, which eliminates one of the potential copies (the one from the value in the function, to the return value). There are other ways to get a return value optimization, but it's not wholly predictable so the simple rule plays safe.
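As a minimal sketch of the "one named variable, one return point" shape described above: the body of getLargeArray(), the element count of 1000, and the helper nrvoExample() are all assumptions made up for illustration, since the question elides the real implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical body for the question's getLargeArray(); the size is an
// arbitrary value chosen for the example.
std::vector<int> getLargeArray() {
    const std::size_t count = 1000;
    std::vector<int> result;               // the single named local
    result.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        result.push_back(static_cast<int>(i));
    }
    return result;                         // the only return statement
}

// Hypothetical caller: initializes a vector directly from the call.
void nrvoExample() {
    std::vector<int> v = getLargeArray();
    assert(v.size() == 1000);
    assert(v.front() == 0 && v.back() == 999);
}
```

Because every path returns the same local, a compiler is free to construct result directly in the caller's storage. Whether it does so remains up to the compiler; the standard permits this elision but does not require it for a named local.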

Next, you want to eliminate the potential copy from the return value to whatever the caller does with it. It's the caller's problem to solve, not the callee's, and there are basically three ways to do this:


  • Use the call to the function as the initializer for a vector<int>, in which case again any credible C++ compiler will elide the copy.
  • Use C++11, where vector has move semantics.
  • In C++03, use "swaptimization".

That is, in C++03 don't write:

vector<int> v;
// use v for some stuff
// ...
// now I want fresh data in v:
v = getLargeArray();

Instead, write:

getLargeArray().swap(v);

This avoids the copy assignment that's required (must not be elided[*]) for v = getLargeArray(). It's not needed in C++11, where there's a cheap move assignment instead of the expensive copy assignment, but of course it still works.
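The three caller-side options can be put side by side in a rough sketch. The getLargeArray() body here is a hypothetical stand-in (1000 elements, all 7), and callerExamples() is a name invented for the illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the question's getLargeArray().
std::vector<int> getLargeArray() {
    std::vector<int> result(1000, 7);
    return result;
}

void callerExamples() {
    // 1. Use the call as an initializer: the copy out of the return
    //    value can be elided.
    std::vector<int> a = getLargeArray();
    assert(a.size() == 1000);

    // 2. C++03 "swaptimization": swap the temporary's buffer into an
    //    existing vector. The old contents of b are destroyed together
    //    with the temporary at the end of the statement.
    std::vector<int> b(3, 1);
    getLargeArray().swap(b);
    assert(b.size() == 1000 && b[0] == 7);

    // 3. C++11 and later: assigning from a temporary selects the move
    //    assignment operator. Compiled as C++03, this same line performs
    //    a full copy assignment.
    std::vector<int> c(3, 1);
    c = getLargeArray();
    assert(c.size() == 1000 && c[0] == 7);
}
```

All three leave the caller with the same contents; they differ only in how many element-wise copies the language version in use may perform.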

Another thing to consider is whether you actually want vector as part of your interface. You could instead perhaps write a function template that takes an output iterator, and writes the data to that output iterator. Callers who want the data in a vector can then pass in the result of std::back_inserter, and so can callers who want the data in a deque or list. Callers who know the size of the data in advance could even pass just a vector iterator (suitably resize()d first) or a raw pointer to a large enough array, to avoid the overhead of back_insert_iterator. There are non-template ways of doing the same thing, but they'll most likely incur a call overhead one way or another. If you're worried about the cost of copying an int per element, then you're worried about the cost of a function call per element.
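One possible shape for the output-iterator interface is sketched below. The names writeLargeArray and outputIteratorExamples, the count parameter n, and the generated values 0 to n-1 are assumptions for the example, not part of the original answer.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <iterator>
#include <vector>

// Hypothetical generator: writes the values 0 .. n-1 through any output
// iterator and returns the iterator one past the last element written.
template <typename OutputIt>
OutputIt writeLargeArray(std::size_t n, OutputIt out) {
    for (std::size_t i = 0; i < n; ++i) {
        *out++ = static_cast<int>(i);
    }
    return out;
}

void outputIteratorExamples() {
    // A caller who wants a vector appends through back_inserter.
    std::vector<int> v;
    writeLargeArray(4, std::back_inserter(v));
    assert(v.size() == 4 && v[3] == 3);

    // The same function fills a deque.
    std::deque<int> d;
    writeLargeArray(4, std::back_inserter(d));
    assert(d.size() == 4 && d.front() == 0);

    // A caller who knows the size can pre-size the vector and pass a
    // plain iterator, skipping back_insert_iterator entirely.
    std::vector<int> sized(4);
    writeLargeArray(4, sized.begin());
    assert(sized[2] == 2);

    // A raw pointer into a large enough array works as well.
    int raw[4];
    writeLargeArray(4, raw);
    assert(raw[3] == 3);
}
```

With a plain iterator or raw pointer the function cannot check bounds, so the caller is responsible for providing enough room.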

If your function doesn't construct and return new data, but rather it returns the current contents of some existing vector<int> and isn't allowed to change the original, then you can't avoid at least one copy when you return by value. So if the performance of that is a proven problem, then you need to look at some API other than return-by-value. For example you might supply a pair of iterators that can be used to traverse the internal data, a function to look up a value in the vector by index, or even (if the performance problem is so serious as to warrant exposing your internals), a reference to the vector. Obviously in all those cases you change the meaning of the function -- now instead of giving the caller "their own data" it provides a view of someone else's data, which might change.
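The difference between handing out a copy and handing out a view can be illustrated with a small hypothetical class. Samples, its member names, and the initial values are invented for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical owner of an internal vector<int>.
class Samples {
public:
    Samples() : data_(3, 42) {}

    // Return by value: the caller gets its own data, at the cost of a copy.
    std::vector<int> copy() const { return data_; }

    // A pair of iterators for traversing the internal data.
    std::vector<int>::const_iterator begin() const { return data_.begin(); }
    std::vector<int>::const_iterator end() const { return data_.end(); }

    // Lookup of a single value by index.
    int at(std::size_t i) const { return data_.at(i); }

    // A reference to the vector itself: no copy, but internals are exposed.
    const std::vector<int>& view() const { return data_; }

    void set(std::size_t i, int value) { data_.at(i) = value; }

private:
    std::vector<int> data_;
};

void viewExamples() {
    Samples s;
    std::vector<int> snapshot = s.copy();     // independent copy
    const std::vector<int>& live = s.view();  // view of s's data

    s.set(0, 7);

    assert(snapshot[0] == 42);  // the copy is unaffected
    assert(live[0] == 7);       // the view sees the change
    assert(s.at(0) == 7);
    assert(*s.begin() == 7);
}
```

The view-style accessors are only valid while the Samples object is alive, which is part of the change in meaning the answer warns about.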

[*] Of course, the "as if" rule still applies, and one can imagine a C++ implementation that's smart enough to realise that since this is a vector of a trivially copyable type (int), and since you haven't taken any pointers to any elements (I assume), it can swap instead, and the result is "as if" it copied. But I wouldn't count on it.
