实现类似std :: vector的容器而没有未定义的行为 [英] Implementing a std::vector like container without undefined behavior

查看:111
本文介绍了实现类似std :: vector的容器而没有未定义的行为的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

这可能会使某些编码器感到惊讶,而且,令人惊讶的是,如果没有编译器的非标准支持,就不可能实现std::vector.问题本质上在于对原始存储区域执行指针算术的能力. @ShafikYaghmour答案中出现的论文 p0593:隐式创建用于低级对象操作的对象明确存在问题的地方,并提出对该标准进行修改,以使诸如容器之类的向量的实现和其他法律级别的编程技术更容易实现.

It may surprise some coders and, as surprising as it can be, it is not possible to implement std::vector without non-standard support of the compilers. The problem essentially resides on the ability to perform pointer arithmetic on a raw storage region. The paper, p0593: Implicit creation of objects for low-level object manipulation, that appears in @ShafikYaghmour answer, exposes clearly the problematic and proposes modification of the standard in order to make implementation of vector like container and other law level programming technique easier.

尽管如此,我想知道是否没有解决办法,仅使用语言提供的内容而不使用标准库来实现等同于std::vector的类型.

Nevertheless I was wondering if there were no work around to implement a type equivalent to std::vector only using what is provided by the language without any use of the standard library.

目标是在原始存储区域中一个接一个地构造矢量元素,并能够使用迭代器访问这些元素.这等效于std :: vector上的push_back序列.

The objective is to construct vector elements, one by one, in a raw storage region, and be able to access those elements using an iterator. This would be equivalent to a sequence of push_back on a std::vector.

要了解这个问题,请简化在libc ++或libstdc ++中对std::vector的实现所执行的操作:

To get an idea of the problem, bellow a simplification of the operations that are performed on the implementation of std::vector in the libc++ or libstdc++:

void access_value(std::string x);

std::string s1, s2, s3;
//allocation
auto p=static_cast<std::string*>(::operator new(10*sizeof(std::string)));

//push_back s1
new(p) std::string(s1);
access_value(*p);//undefined behavior, p is not a pointer to object

//push_back s2
new(p+1) std::string(s2);//undefined behavior
        //, pointer arithmetic but no array (neither implicit array of size 1)
access_value(*(p+1));//undefined behavior, p+1 is not a pointer to object

//push_back s2
new(p+2) std::string(s3);//undefined behavior
        //, pointer arithmetic but no array
access_value(*(p+2));//undefined behavior, p+2 is not a pointer to object

我的想法是使用永远不会初始化其成员的联合.

My idea is to use a union that never initialize its member.

//almost trivialy default constructible
template<class T>
union atdc{
  char _c;
  T value;
  atdc ()noexcept{ }
  ~atdc(){}
};

原始存储将使用此联合类型的数组初始化,并且指针运算始终在此数组上执行.然后,在每次push_back时,在联合的非活动成员上构造元素.

The raw storage will be initialized with an array of this union type, and pointer arithmetic is always performed on this array. Then elements are constructed on the inactive member of the union at each push_back.

std::string s1, s2, s3;
auto p=::operator new(10*sizeof(std::string));
auto arr = new(p) atdc<std::string>[10];
//pointer arithmetic on arr is allowed

//push_back s1
new(&arr[0].value) std::string(s1); //union member activation
access_value(arr[0].value);

//push_back s2
new(&arr[1].value) std::string(s2);
access_value(arr[1].value);

//push_back s2
new(&arr[2].value) std::string(s2);
access_value(arr[2].value);

上面的代码中是否存在未定义的行为?

Is there any undefined behavior in this code above?

推荐答案

该主题正在积极讨论中,我们可以在提案

This is topic that is under active discussion, we can see this in the proposal p0593: Implicit creation of objects for low-level object manipulation. This is a pretty solid discussion of the issues and why they are not fixable without changes. If you have different approaches or strong views on the approaches being considered you may want to reach out to the proposals authors.

它包含以下讨论:

2.3.动态构造数组

2.3. Dynamic construction of arrays

考虑一下该程序,该程序尝试实现std :: vector之类的类型(为简洁起见,省略了许多详细信息):

Consider this program that attempts to implement a type like std::vector (with many details omitted for brevity):

....

在实践中,此代码可在多种现有代码上运行 实现,但根据C ++对象模型,未定义 行为发生在#a,#b,#c,#d和#e点,因为它们试图 对分配的存储区域执行指针算术运算 不包含数组对象.

In practice, this code works across a range of existing implementations, but according to the C++ object model, undefined behavior occurs at points #a, #b, #c, #d, and #e, because they attempt to perform pointer arithmetic on a region of allocated storage that does not contain an array object.

在位置#b,#c和#d,算术运算是对char *进行的, 在#a,#e和#f位置,对T *执行算术运算. 理想情况下,解决此问题的方法将使两个计算都没有 定义的行为.

At locations #b, #c, and #d, the arithmetic is performed on a char*, and at locations #a, #e, and #f, the arithmetic is performed on a T*. Ideally, a solution to this problem would imbue both calculations with defined behavior.

  1. 方法

以上代码段有一个共同的主题:它们尝试使用从未创建过的对象.实际上,程序员认为不需要为它们显式创建对象而拥有一类类型.我们建议识别这些类型,并仔细制定规则,从而消除对显式创建此类对象的需求,而改为隐式创建它们.

The above snippets have a common theme: they attempt to use objects that they never created. Indeed, there is a family of types for which programmers assume they do not need to explicitly create objects. We propose to identify these types, and carefully carve out rules that remove the need to explicitly create such objects, by instead creating them implicitly.

使用 adc 联合的方法存在一个问题,我们希望能够通过指针T*即通过严格的别名规则,因此是未定义的行为.

The approach using the adc union has the problem that we expect to be able to access the contained data via a pointer T* i.e. via std::vector::data. Accessing the union as a T* would violate the strict aliasing rules and therefore be undefined behavior.

这篇关于实现类似std :: vector的容器而没有未定义的行为的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆