GNU STL string: is copy-on-write involved here?

Problem description

(Disclaimer: I don't know what the C++ standard might say about this... I know, I'm horrible.)

While operating on very large strings, I noticed that std::string is using copy-on-write. I managed to write the smallest loop that would reproduce the observed behaviour; the following one, for instance, runs suspiciously fast:

#include <string>
using std::string;
int main(void) {
    string basestr(1024 * 1024 * 10, 'A');
    for (int i = 0; i < 100; i++) {
        string a_copy = basestr;
    }
}

When I added a write to the loop body, a_copy[1] = 'B';, an actual copy apparently took place, and the program ran in 0.3s instead of a few milliseconds. 100 writes slowed it down by about 100 times.

But then it got weird. Some of my strings weren't written to, only read from, and this was not reflected in the execution time, which was almost exactly proportional to the number of operations on the strings. With some digging, I found that simply reading from a string still gave me that performance hit, so it led me to assume GNU STL strings are using copy-on-read (?).

#include <string>
using std::string;
int main(void) {
    string basestr(1024 * 1024 * 10, 'A');
    for (int i = 0; i < 100; i++) {
        string a_copy = basestr;
        a_copy[99]; // this also ran in 0.3s!
    }
}

After revelling in my discovery for a while, I found out that reading (with operator[]) from the base string also takes 0.3s for the entire toy program... I'm not 100% comfortable with this. Are STL strings indeed copy-on-read, or do they allow copy-on-write at all? I'm led to think that operator[] has some safeguards against someone who would keep the reference it returns and later write to it; is this really the case? If not, what is really happening? If someone can point to some relevant section in the C++ standard, that'd also be appreciated.

For reference, I'm using g++ (Ubuntu 4.4.3-4ubuntu5) 4.4.3, and the GNU STL.

Solution

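C++ doesn't distinguish between an operator[] for reading and one for writing; it only distinguishes the operator[] for const objects from the one for mutable (non-const) objects. Since a_copy is mutable, the non-const operator[] is chosen even when you only read the result. That forces the copy, because the operator returns a (mutable) reference through which the caller could later write, and the string has no way of knowing whether that will happen.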
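The overload selection described above can be seen in isolation with a toy type. This is only a sketch: Probe, its counters, and demo_overloads are hypothetical names made up for illustration, not libstdc++ code. It shows that the choice depends on the constness of the object, not on whether the result is read or written.

```cpp
#include <cstddef>
#include <string>

// Hypothetical toy type (not libstdc++ code): counts which overload ran.
struct Probe {
    std::string data = "hello";
    mutable int const_calls = 0;  // mutable so the const overload can count
    int mutable_calls = 0;

    // Chosen for non-const objects, even if the caller only reads.
    char& operator[](std::size_t i) { ++mutable_calls; return data[i]; }

    // Chosen for const objects and const references.
    const char& operator[](std::size_t i) const { ++const_calls; return data[i]; }
};

// Reads one character through a non-const object, then through a const reference.
inline Probe demo_overloads() {
    Probe p;
    char a = p[1];                             // non-const overload
    char b = static_cast<const Probe&>(p)[1];  // const overload
    (void)a;
    (void)b;
    return p;
}
```

After demo_overloads() returns, each counter has been incremented once, although both accesses were plain reads.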

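If efficiency is a concern, you can access a_copy through a const reference to force the const version of operator[] to be used, which won't make a copy of the internal buffer. Note the & in the cast: casting to a plain const string would construct a temporary string object first.

char f = static_cast<const string&>(a_copy)[99];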
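A named const reference, or a helper that takes its argument by const reference, gives the same const access as a cast and may read a little more clearly. The following is a minimal sketch; read_char and demo_read are hypothetical names. Whether this actually saves a buffer copy depends on the library: it matters for a copy-on-write implementation such as the one in GCC 4.4, while C++11 and later effectively rule out copy-on-write for std::string, so on newer toolchains the copy already happens at string a_copy = basestr.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: taking the string by const reference means
// overload resolution picks the const operator[] inside.
inline char read_char(const std::string& s, std::size_t i) {
    return s[i];
}

// Mirrors the loop body from the question, reading without the mutable operator[].
inline char demo_read() {
    std::string basestr(1024, 'A');
    std::string a_copy = basestr;
    const std::string& ro = a_copy;  // binds a reference; no new string object
    return ro[99];                   // const operator[]
}
```

In the loop from the question, the read would then be written as read_char(a_copy, 99) or ro[99] instead of a_copy[99].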
