push_back() and emplace_back() behind the scenes

Question

I'm currently learning C++ on my own, and I am curious about how push_back() and emplace_back() work under the hood. I've always assumed that emplace_back() is faster when you are trying to construct and push a large object to the back of a container, like a vector.

Let's suppose I have a Student object that I want to append to the back of a vector of Students.

struct Student {
   string name;
   int student_ID;
   double GPA;
   string favorite_food;
   string favorite_prof;
   int hours_slept;
   int birthyear;
   Student(string name_in, int ID_in, double GPA_in, string food_in, 
           string prof_in, int sleep_in, int birthyear_in) :
           /* initialize member variables */ { }
};

Suppose I call push_back() and push a Student object to the end of a vector:

vector<Student> vec;
vec.push_back(Student("Bob", 123456, 3.89, "pizza", "Smith", 7, 1997));

My understanding here is that push_back creates an instance of the Student object outside of the vector and then moves it to the back of the vector.

Diagram: (image not reproduced)

I can also emplace instead of push:

vector<Student> vec;
vec.emplace_back("Bob", 123456, 3.89, "pizza", "Smith", 7, 1997);

My understanding here is that the Student object is constructed at the very back of the vector so that no moving is required.
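
One way to check this mental model is to count how often each constructor runs. The sketch below is only an illustration: Tracked, Counters, count_push_back, count_emplace_back, and check_counts are hypothetical names that do not appear in the original code. Capacity is reserved up front so that reallocation does not add extra moves.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical bookkeeping for this sketch: how often each constructor ran.
struct Counters {
    int constructed = 0;
    int copied = 0;
    int moved = 0;
};

// Hypothetical stand-in for Student that records how it was created.
struct Tracked {
    static Counters counts;
    std::string name;

    explicit Tracked(std::string n) : name(std::move(n)) { ++counts.constructed; }
    Tracked(const Tracked& other) : name(other.name) { ++counts.copied; }
    Tracked(Tracked&& other) noexcept : name(std::move(other.name)) { ++counts.moved; }
};
Counters Tracked::counts{};

Counters count_push_back() {
    Tracked::counts = Counters{};
    std::vector<Tracked> vec;
    vec.reserve(1);                  // avoid moves caused by reallocation
    vec.push_back(Tracked("Bob"));   // temporary first, then a move into the vector
    return Tracked::counts;
}

Counters count_emplace_back() {
    Tracked::counts = Counters{};
    std::vector<Tracked> vec;
    vec.reserve(1);
    vec.emplace_back("Bob");         // constructed directly in the vector's storage
    return Tracked::counts;
}

// Checks the two counts against the mental model described above.
void check_counts() {
    assert(count_push_back().moved == 1);
    assert(count_emplace_back().moved == 0);
}
```

Calling check_counts() from main confirms that push_back performs one extra move and emplace_back none. It says nothing about how expensive that move is, which is what the timings depend on.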

Diagram: (image not reproduced)

Thus, it would make sense that emplacing would be faster, especially if many Student objects are added. However, when I timed these two versions of code:

for (int i = 0; i < 10000000; ++i) {
    vec.push_back(Student("Bob", 123456, 3.89, "pizza", "Smith", 7, 1997));
}

and

for (int i = 0; i < 10000000; ++i) {
    vec.emplace_back("Bob", 123456, 3.89, "pizza", "Smith", 7, 1997);
}

I expected the latter to be faster, since the large Student object wouldn't have to be moved. Oddly enough, the emplace_back version ended up being slower (across multiple attempts). I also tried inserting 10000000 Student objects, where the constructor takes in references and the arguments in push_back() and emplace_back() are stored in variables. This also didn't work, as emplace was still slower.

I've checked to make sure that I'm inserting the same number of objects in both cases. The time difference isn't too large, but emplacing ended up slower by a few seconds.

Is there something wrong with my understanding of how push_back() and emplace_back() work? Thank you very much for your time!

Here's the code, as requested. I'm using the g++ compiler.

Push back:

#include <string>
#include <vector>
using std::string;
using std::vector;

struct Student {
   string name;
   int student_ID;
   double GPA;
   string favorite_food;
   string favorite_prof;
   int hours_slept;
   int birthyear;
   // Strings are taken by value and then copied into the members.
   Student(string name_in, int ID_in, double GPA_in, string food_in, 
           string prof_in, int sleep_in, int birthyear_in) :
           name(name_in), student_ID(ID_in), GPA(GPA_in), 
           favorite_food(food_in), favorite_prof(prof_in),
           hours_slept(sleep_in), birthyear(birthyear_in) {}
};

int main() {
    vector<Student> vec;
    vec.reserve(10000000);  // allocate once so reallocation is not timed
    for (int i = 0; i < 10000000; ++i) 
         vec.push_back(Student("Bob", 123456, 3.89, "pizza", "Smith", 7, 1997));
    return 0;
}

Emplace back:

#include <string>
#include <vector>
using std::string;
using std::vector;

struct Student {
   string name;
   int student_ID;
   double GPA;
   string favorite_food;
   string favorite_prof;
   int hours_slept;
   int birthyear;
   // Strings are taken by value and then copied into the members.
   Student(string name_in, int ID_in, double GPA_in, string food_in, 
           string prof_in, int sleep_in, int birthyear_in) :
           name(name_in), student_ID(ID_in), GPA(GPA_in), 
           favorite_food(food_in), favorite_prof(prof_in),
           hours_slept(sleep_in), birthyear(birthyear_in) {}
};

int main() {
    vector<Student> vec;
    vec.reserve(10000000);  // allocate once so reallocation is not timed
    for (int i = 0; i < 10000000; ++i) 
         vec.emplace_back("Bob", 123456, 3.89, "pizza", "Smith", 7, 1997);
    return 0;
}

Solution

This behavior is due to the complexity of std::string. There are a couple of things interacting here:

  • The Small String Optimization (SSO): short strings are stored inside the std::string object itself, so no heap allocation is needed.
  • In the push_back version, the compiler is able to determine the length of each string at compile time, whereas it is unable to do so in the emplace_back version. Thus, the emplace_back call requires calls to strlen. Furthermore, since the compiler doesn't know the length of the string at that point, it has to emit code for both the SSO and non-SSO cases (see Jason Turner's talk "Initializer Lists Are Broken, Let's Fix Them"; it's a long talk, but he follows the problem of inserting strings into a vector throughout it).
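
To make the SSO point concrete, the following sketch checks whether a string's character buffer lives inside the std::string object itself. It relies on implementation details rather than anything the standard guarantees, and stored_inline is a hypothetical helper name chosen for this example.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: returns true when the character data of `s` lies
// within the bytes of the std::string object itself, which is what the
// small string optimization does for short strings. This inspects an
// implementation detail and is not guaranteed by the C++ standard.
bool stored_inline(const std::string& s) {
    const auto object_begin = reinterpret_cast<std::uintptr_t>(&s);
    const auto object_end = object_begin + sizeof(std::string);
    const auto data = reinterpret_cast<std::uintptr_t>(s.data());
    return data >= object_begin && data < object_end;
}
```

On common standard library implementations, a short string such as "Bob" is expected to be stored inline, while a string longer than the std::string object itself cannot be.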

Consider this simpler type:

struct type {
  std::string a;
  std::string b;
  std::string c;

  type(std::string a, std::string b, std::string c)
    : a{a}
    , b{b}
    , c{c}
  {}
};

Note how the constructor copies a, b, and c.

Testing this against a baseline of just allocating memory, we can see that push_back outperforms emplace_back:

(Figure: quick-bench chart; image and link not reproduced)

Because the strings in your example all fit inside the SSO buffer, copying is just as cheap as moving in this case. Thus, the constructor is perfectly efficient, and the improvements from emplace_back have a smaller effect.
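
The charts in this answer came from quick-bench. As a rough local alternative, a comparison along the same lines could be sketched with std::chrono as below. The names time_fill_ms, time_push_back_ms, and time_emplace_back_ms are made up for this example, and a hand-rolled timer like this is far noisier than a proper benchmarking framework, so treat any numbers it produces with caution.

```cpp
#include <chrono>
#include <cstddef>
#include <string>
#include <vector>

// Same shape as the simple type discussed here: the constructor copies.
struct type {
    std::string a;
    std::string b;
    std::string c;

    type(std::string a, std::string b, std::string c)
        : a{a}, b{b}, c{c} {}
};

// Hypothetical helper: times `n` calls of `fill` on a pre-reserved vector
// and returns the elapsed time in milliseconds (or -1 on a size mismatch).
template <typename Fill>
double time_fill_ms(std::size_t n, Fill fill) {
    std::vector<type> vec;
    vec.reserve(n);  // keep reallocation out of the measurement
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < n; ++i) {
        fill(vec);
    }
    const auto stop = std::chrono::steady_clock::now();
    if (vec.size() != n) {
        return -1.0;
    }
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

double time_push_back_ms(std::size_t n) {
    return time_fill_ms(n, [](std::vector<type>& v) {
        v.push_back({"Bob", "pizza", "Smith"});
    });
}

double time_emplace_back_ms(std::size_t n) {
    return time_fill_ms(n, [](std::vector<type>& v) {
        v.emplace_back("Bob", "pizza", "Smith");
    });
}
```

Build with optimizations enabled (for example -O2 with g++) and run each function several times before comparing the results.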

Also, compare the assembly generated for a call to push_back with that for a call to emplace_back:

// push_back call
void foo(std::vector<type>& vec) {
    vec.push_back({"Bob", "pizza", "Smith"});
}

// emplace_back call
void foo(std::vector<type>& vec) {
    vec.emplace_back("Bob", "pizza", "Smith");
}

(Assembly not copied here. It's massive. std::string is complicated)

We can see that emplace_back has calls to strlen, whereas push_back does not. With emplace_back, the string literals are forwarded through several layers of template code before a std::string is finally constructed from them; with that much distance between the literal and the construction, the compiler was unable to optimize out the calls to strlen.
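
The difference between knowing a literal's length statically and having to measure it can be sketched as follows. This only illustrates what information is available in each situation, not what GCC does internally; literal_length and pointer_length are hypothetical names used for this example.

```cpp
#include <cstddef>
#include <cstring>

// While the argument still has its array type, the length is part of the
// type and therefore known at compile time. N includes the trailing '\0'.
template <std::size_t N>
constexpr std::size_t literal_length(const char (&)[N]) {
    return N - 1;
}

// Once only a pointer is left, the length has to be measured at run time
// unless the optimizer can still see which literal the pointer refers to.
std::size_t pointer_length(const char* s) {
    return std::strlen(s);
}

static_assert(literal_length("pizza") == 5, "length is known at compile time");
```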

Explicitly calling the std::string constructor would remove the calls to strlen, but would no longer construct them in place, so that doesn't work to speed up emplace_back.
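
For illustration, explicitly constructing the strings at the call site might look like the sketch below. The function name emplace_with_explicit_strings is hypothetical. The three std::string temporaries are created outside the vector and then passed to the constructor of type, which is why this is no longer true in-place construction.

```cpp
#include <string>
#include <vector>

// Same shape as the simple type discussed here: the constructor copies.
struct type {
    std::string a;
    std::string b;
    std::string c;

    type(std::string a, std::string b, std::string c)
        : a{a}, b{b}, c{c} {}
};

// Hypothetical example: each std::string is built at the call site,
// right next to its literal, before emplace_back forwards it on.
void emplace_with_explicit_strings(std::vector<type>& vec) {
    vec.emplace_back(std::string("Bob"), std::string("pizza"), std::string("Smith"));
}
```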

All this said, if we use strings long enough to exceed the SSO buffer, the allocation cost completely drowns out these details, so emplace_back and push_back have the same performance:

(Figure: quick-bench chart; image and link not reproduced)


If you fix the constructor of type to move its arguments, emplace_back becomes faster in all cases.

struct type {
  std::string a;
  std::string b;
  std::string c;

  type(std::string a, std::string b, std::string c)
    : a{std::move(a)}
    , b{std::move(b)}
    , c{std::move(c)}
  {}
};

SSO case: (Figure: quick-bench chart; image and link not reproduced)

Long case: (Figure: quick-bench chart; image and link not reproduced)

However, the SSO push_back case slowed down; the compiler seems to emit extra copies.

The optimal version, which uses perfect forwarding, does not suffer from this drawback (note the scale change on the vertical axis):

struct type {
  std::string a;
  std::string b;
  std::string c;

  template <typename A, typename B, typename C>
  type(A&& a, B&& b, C&& c)
    : a{std::forward<A>(a)}
    , b{std::forward<B>(b)}
    , c{std::forward<C>(c)}
  {}
};

(Figure: quick-bench chart; image and link not reproduced)
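
As a small usage sketch of the perfect-forwarding constructor: lvalue arguments are copied, rvalue arguments are moved, and string literals are converted directly into the member. The helper make_example and its parameter names are hypothetical and exist only for this illustration.

```cpp
#include <string>
#include <utility>

struct type {
    std::string a;
    std::string b;
    std::string c;

    template <typename A, typename B, typename C>
    type(A&& a, B&& b, C&& c)
        : a{std::forward<A>(a)}
        , b{std::forward<B>(b)}
        , c{std::forward<C>(c)}
    {}
};

// Hypothetical helper: `kept` is an lvalue, so it is copied and left
// untouched; `given_up` is an rvalue, so it is moved from; the literal
// is used to construct member `c` directly.
type make_example(std::string& kept, std::string&& given_up) {
    return type(kept, std::move(given_up), "Smith");
}
```

After a call such as make_example(name, std::move(food)), name still holds its original value, while food is left in a valid but unspecified state.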
