How to implement Copy-on-Write?

Question

I want to implement copy-on-write for my custom C++ String class, and I wonder how to...

I tried to implement some options, but they all turned out very inefficient.

Thanks, guys :-)

Recommended answer

In a multi-threaded environment (which is most of them nowadays), CoW is frequently a huge performance hit rather than a gain. And with careful use of const references, it's not much of a performance gain even in a single-threaded environment.

This old DDJ (Dr. Dobb's Journal) article explains just how bad CoW can be in an environment built for multithreading, even if the program only ever runs one thread, because the reference counting still has to be thread-safe.

Additionally, as other people have pointed out, CoW strings are really tricky to implement, and it's easy to make mistakes. That, coupled with their poor performance in threaded code, makes me really question their usefulness in general. This becomes even more true once you start using C++11 move construction and move assignment, which already avoid many of the copies CoW was meant to save.
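
To make the point about const references and moves concrete, here is a minimal sketch of a plain, non-CoW string. The names `PlainString` and `countChar` are hypothetical and not part of the question; the sketch only illustrates that passing by const reference copies nothing and that a move just hands over the pointer, with no reference count involved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical non-CoW string: every object owns its buffer exclusively.
class PlainString {
 public:
   explicit PlainString(const char *s)
      : len_(std::strlen(s)), data_(new char[len_ + 1])
   {
      std::memcpy(data_, s, len_ + 1);
   }
   // Copying is a deep copy: one allocation plus one memcpy.
   PlainString(const PlainString &o)
      : len_(o.len_), data_(new char[o.len_ + 1])
   {
      std::memcpy(data_, o.c_str(), len_ + 1);
   }
   // Moving steals the pointer: no allocation and no reference count.
   PlainString(PlainString &&o) noexcept
      : len_(o.len_), data_(o.data_)
   {
      o.len_ = 0;
      o.data_ = nullptr;
   }
   // Copy-and-swap: the by-value parameter is copied or moved by the caller.
   PlainString &operator=(PlainString o) noexcept
   {
      std::swap(len_, o.len_);
      std::swap(data_, o.data_);
      return *this;
   }
   ~PlainString() { delete[] data_; }

   std::size_t size() const { return len_; }
   char at(std::size_t i) const { assert(i < len_); return data_[i]; }
   // A moved-from string has no buffer, so fall back to an empty literal.
   const char *c_str() const { return data_ ? data_ : ""; }

 private:
   std::size_t len_;
   char *data_;
};

// Taking the argument by const reference avoids any copy at all.
inline std::size_t countChar(const PlainString &s, char c)
{
   std::size_t n = 0;
   for (std::size_t i = 0; i < s.size(); ++i) {
      if (s.at(i) == c) ++n;
   }
   return n;
}
```

With a class like this, `PlainString b(std::move(a));` leaves `b` pointing at the buffer `a` used to own, which is the cheap case CoW was originally meant to provide.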

But, to answer your question....

Here are a couple of implementation techniques that may help with performance.

First, store the length in the string object itself rather than in the shared buffer. The length is accessed quite frequently, and eliminating the pointer dereference would probably help. Just for consistency, I would put the allocated length there too. This will cost you in terms of your string objects being a bit bigger, but the overhead in space and copying time is very small, especially since these values then become easier for the compiler to play interesting optimization tricks with.

This leaves you with a string class that looks like this:

class MyString {
   ...
 private:
   class Buf {
      ...
    private:
      ::std::size_t refct_;
      char *data_;
   };

   ::std::size_t len_;
   ::std::size_t alloclen_;
   Buf *data_;
};
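
As a rough illustration of how the copy-on-write logic itself might sit on top of this layout, here is a minimal single-threaded sketch. Everything beyond the member names above is an assumption made for the example: the class name `CowString`, the helpers `detach()` and `release()`, and the accessor `set()` are hypothetical. The idea is that copying shares the buffer and bumps `refct_`, and any write first detaches if the buffer is shared.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical single-threaded sketch, not a complete string class.
class CowString {
 public:
   explicit CowString(const char *s)
      : len_(std::strlen(s)), alloclen_(len_ + 1), data_(new Buf(s, len_ + 1))
   {
   }
   // Copying shares the buffer and bumps the reference count.
   CowString(const CowString &o)
      : len_(o.len_), alloclen_(o.alloclen_), data_(o.data_)
   {
      ++data_->refct_;
   }
   CowString &operator=(const CowString &o)
   {
      ++o.data_->refct_;   // increment first so self-assignment stays safe
      release();
      len_ = o.len_;
      alloclen_ = o.alloclen_;
      data_ = o.data_;
      return *this;
   }
   ~CowString() { release(); }

   std::size_t size() const { return len_; }
   // Reads never copy.
   char at(std::size_t i) const { assert(i < len_); return data_->data_[i]; }
   const char *c_str() const { return data_->data_; }
   // Writes go through detach(), which copies only if the buffer is shared.
   void set(std::size_t i, char c)
   {
      assert(i < len_);
      detach();
      data_->data_[i] = c;
   }
   bool sharesBufferWith(const CowString &o) const { return data_ == o.data_; }

 private:
   struct Buf {
      Buf(const char *src, std::size_t n) : refct_(1), data_(new char[n])
      {
         std::memcpy(data_, src, n);
      }
      ~Buf() { delete[] data_; }
      Buf(const Buf &) = delete;
      Buf &operator=(const Buf &) = delete;
      std::size_t refct_;
      char *data_;
   };

   // Drop this object's reference and free the buffer if it was the last one.
   void release()
   {
      if (--data_->refct_ == 0) delete data_;
   }
   // Make sure this object is the only owner of its buffer.
   void detach()
   {
      if (data_->refct_ > 1) {
         Buf *copy = new Buf(data_->data_, alloclen_);
         --data_->refct_;
         data_ = copy;
      }
   }

   std::size_t len_;
   std::size_t alloclen_;
   Buf *data_;
};
```

A real class would also have to detach before handing out any non-const reference, pointer, or iterator into the buffer, which is one of the places where CoW implementations tend to go wrong.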

Now, there are further optimizations you can perform. The Buf class there looks like it doesn't really contain or do much, and this is true. Additionally, it requires allocating both an instance of Buf and a buffer to hold the characters. This seems rather wasteful. So, we'll turn to a common C implementation technique, stretchy buffers:

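#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

class MyString {
   ...
 private:
   struct Buf {
      ::std::size_t refct_;
      char data_[1];
   };

   void resizeBufTo(::std::size_t newsize);
   void dereferenceBuf();

   ::std::size_t len_;
   ::std::size_t alloclen_;
   Buf *data_;
};

void MyString::resizeBufTo(::std::size_t newsize)
{
   // Only an unshared (or not yet allocated) buffer may be resized in place.
   assert((data_ == 0) || (data_->refct_ == 1));
   if (newsize != 0) {
      const bool isnew = (data_ == 0);
      // Yes, I'm using C's allocation functions on purpose.
      // C++'s new is a poor match for stretchy buffers.
      // realloc returns void *, which C++ does not convert implicitly.
      Buf *newbuf = static_cast<Buf *>(
         ::std::realloc(data_, sizeof(Buf) + (newsize - 1)));
      if (newbuf == 0) {
         // On failure realloc leaves the old buffer untouched.
         throw ::std::bad_alloc();
      }
      if (isnew) {
         newbuf->refct_ = 1;  // a freshly allocated buffer has one owner
      }
      data_ = newbuf;
   } else { // newsize is 0
      if (data_ != 0) {
         ::std::free(data_);
         data_ = 0;
      }
   }
   alloclen_ = newsize;
}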

When you do things this way, you can then treat data_->data_ as if it contained alloclen_ bytes instead of just 1. (This is the classic "struct hack": compilers support it in practice, but ISO C++ does not formally guarantee it.)

Keep in mind that in all of these cases you have to either make sure you never use this in a multi-threaded environment, or make sure that refct_ is a type for which you have both an atomic increment and an atomic decrement-and-test operation.
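
In C++11 and later, one way to get those two operations is `std::atomic`. The sketch below is only an illustration under that assumption; the wrapper class `AtomicRefCount` and its method names are hypothetical, and the memory orderings shown are one common choice rather than the only valid one.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical reference-count wrapper; the names are illustrative only.
class AtomicRefCount {
 public:
   AtomicRefCount() : refct_(1) {}

   // Atomic increment. Relaxed ordering is enough here because the caller
   // already holds a reference, so the buffer cannot disappear underneath it.
   void acquire() { refct_.fetch_add(1, std::memory_order_relaxed); }

   // Atomic decrement-and-test. Returns true if the caller just dropped the
   // last reference and is therefore responsible for freeing the buffer.
   bool release()
   {
      const std::size_t old = refct_.fetch_sub(1, std::memory_order_acq_rel);
      assert(old > 0);  // releasing more often than acquiring is a bug
      return old == 1;
   }

   // Snapshot used to decide whether a write must copy first.
   bool isShared() const
   {
      return refct_.load(std::memory_order_acquire) > 1;
   }

 private:
   std::atomic<std::size_t> refct_;
};
```

Even with an atomic counter, every copy and every destruction now costs an atomic read-modify-write, which is exactly the overhead that makes CoW unattractive in threaded code.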

There is an even more advanced optimization technique that involves using a union to store short strings right inside the bits of data that you would use to describe a longer string. But that's even more complex, and I don't think I will feel inclined to edit this to put a simplified example here later, but you never can tell.
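
For orientation only, the layout idea could be sketched roughly as follows. This is not the answer's own code: `SsoString`, `Long`, and `kInlineCap` are hypothetical names, and reference counting and copying are left out so that the union trick stays visible. Short strings live inside the bytes that would otherwise hold the allocated length and the buffer pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical layout sketch of the short-string idea (no CoW, no copying).
class SsoString {
 public:
   explicit SsoString(const char *s)
   {
      assert(s != nullptr);
      len_ = std::strlen(s);
      if (isSmall()) {
         // Short string: the characters go straight into the object.
         std::memcpy(u_.small_, s, len_ + 1);
      } else {
         // Long string: the same bytes describe a heap buffer instead.
         u_.long_.alloclen_ = len_ + 1;
         u_.long_.data_ = new char[len_ + 1];
         std::memcpy(u_.long_.data_, s, len_ + 1);
      }
   }
   ~SsoString()
   {
      if (!isSmall()) delete[] u_.long_.data_;
   }
   SsoString(const SsoString &) = delete;
   SsoString &operator=(const SsoString &) = delete;

   std::size_t size() const { return len_; }
   // The length doubles as the discriminator for the union.
   bool isSmall() const { return len_ < kInlineCap; }
   const char *c_str() const
   {
      return isSmall() ? u_.small_ : u_.long_.data_;
   }

 private:
   struct Long {
      std::size_t alloclen_;
      char *data_;
   };
   // Room for the characters plus the terminating '\0'.
   static const std::size_t kInlineCap = sizeof(Long);

   union {
      Long long_;
      char small_[sizeof(Long)];
   } u_;
   std::size_t len_;
};
```

On a typical 64-bit platform this stores strings of up to 15 characters without touching the heap; combining it with a shared, reference-counted buffer for the long case is where the extra complexity comes in.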
