What are Aggregates and PODs and how/why are they special?


Problem description

This FAQ is about Aggregates and PODs and covers the following material:

  • What are Aggregates?
  • What are PODs (Plain Old Data)?
  • How are they related?
  • How and why are they special?
  • What changes for C++11?

Solution

How to read:

This article is rather long. If you want to know about both aggregates and PODs (Plain Old Data) take time and read it. If you are interested just in aggregates, read only the first part. If you are interested only in PODs then you must first read the definition, implications, and examples of aggregates and then you may jump to PODs but I would still recommend reading the first part in its entirety. The notion of aggregates is essential for defining PODs. If you find any errors (even minor, including grammar, stylistics, formatting, syntax, etc.) please leave a comment, I'll edit.

This answer applies to C++03. For other C++ standards see:

What are aggregates and why they are special

Formal definition from the C++ standard (C++03 8.5.1 §1):

An aggregate is an array or a class (clause 9) with no user-declared constructors (12.1), no private or protected non-static data members (clause 11), no base classes (clause 10), and no virtual functions (10.3).

So, OK, let's parse this definition. First of all, any array is an aggregate. A class can also be an aggregate if… wait! Nothing is said about structs or unions, can't they be aggregates? Yes, they can. In C++, the term class refers to all classes, structs, and unions. So, a class (or struct, or union) is an aggregate if and only if it satisfies the criteria from the above definition. What do these criteria imply?

  • No user-declared constructors. This does not mean an aggregate class cannot have constructors; in fact it can have a default constructor and/or a copy constructor as long as they are implicitly declared by the compiler, and not explicitly by the user

  • No private or protected non-static data members. You can have as many private and protected member functions (but not constructors), and as many private or protected static data members and static member functions, as you like without violating the rules for aggregate classes

  • An aggregate class can have a user-declared/user-defined copy-assignment operator and/or destructor

  • An array is an aggregate even if it is an array of non-aggregate class type.

Now let's look at some examples:

class NotAggregate1
{
  virtual void f() {} //remember? no virtual functions
};

class NotAggregate2
{
  int x; //x is private by default and non-static 
};

class NotAggregate3
{
public:
  NotAggregate3(int) {} //oops, user-defined constructor
};

class Aggregate1
{
public:
  NotAggregate1 member1;   //ok, public member
  Aggregate1& operator=(Aggregate1 const & rhs) {/* */ return *this;} //ok, copy-assignment
private:
  void f() {} // ok, just a private function
};
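
As a further sketch of the bullets above, the following hypothetical type (the names Point, bias, offset and point_sum are made up for illustration and are not part of the original answer) combines a user-defined destructor, a user-defined copy-assignment operator, a private static data member and a private member function, and should still count as an aggregate. The brace initialization inside point_sum() is the aggregate-only syntax explained in the paragraphs that follow.

```cpp
// Hypothetical example type; none of these names come from the original answer.
struct Point
{
  int x;
  int y;

  ~Point() {}                          // user-defined destructor: still an aggregate

  Point& operator=(Point const& rhs)   // user-defined copy assignment: still an aggregate
  {
    x = rhs.x;
    y = rhs.y;
    return *this;
  }

  int sum() const { return x + y + offset(); }

private:
  static int bias;                     // private static data member: allowed
  static int offset() { return bias; } // private member function: allowed
};

int Point::bias = 0;

// Brace-initializes a Point and returns the sum of its members.
int point_sum()
{
  Point p = {3, 4}; // compiles only because Point is an aggregate
  return p.sum();
}
```

Adding a constructor such as Point(int, int) to this type, or moving x into the private section, would be expected to make the brace initialization in point_sum() ill-formed under the C++03 rules described here.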

You get the idea. Now let's see how aggregates are special. They, unlike non-aggregate classes, can be initialized with curly braces {}. This initialization syntax is commonly known for arrays, and we just learnt that these are aggregates. So, let's start with them.

Type array_name[n] = {a1, a2, …, am};

if(m == n)
the ith element of the array is initialized with ai
else if(m < n)
the first m elements of the array are initialized with a1, a2, …, am and the other n - m elements are, if possible, value-initialized (see below for the explanation of the term)
else if(m > n)
the compiler will issue an error
else (this is the case when n isn't specified at all like int a[] = {1, 2, 3};)
the size of the array (n) is assumed to be equal to m, so int a[] = {1, 2, 3}; is equivalent to int a[3] = {1, 2, 3};

When an object of scalar type (bool, int, char, double, pointers, etc.) is value-initialized it means it is initialized with 0 for that type (false for bool, 0.0 for double, etc.). When an object of class type with a user-declared default constructor is value-initialized its default constructor is called. If the default constructor is implicitly defined then all nonstatic members are recursively value-initialized. This definition is imprecise and a bit incorrect but it should give you the basic idea. A reference cannot be value-initialized. Value-initialization for a non-aggregate class can fail if, for example, the class has no appropriate default constructor.
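
A minimal sketch of the rule just described, assuming two made-up structs (Inner and Outer) that have no user-declared constructors. The expression Outer() asks for value-initialization, so every scalar reachable through the members should end up as zero.

```cpp
// Hypothetical types for illustration only.
struct Inner
{
  int a;
  double b;
};

struct Outer
{
  Inner in;
  int* p;
  bool flag;
};

// Returns true if a value-initialized scalar and every scalar inside a
// value-initialized Outer compare equal to zero.
bool value_init_zeroes_everything()
{
  int i = int();     // value-initialized scalar
  Outer o = Outer(); // no user-declared constructor: members are value-initialized recursively
  return i == 0
      && o.in.a == 0
      && o.in.b == 0.0
      && o.p == 0
      && o.flag == false;
}
```

Note the contrast with a plain local declaration such as Outer o; which has no initializer and therefore leaves the members with indeterminate values.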

Examples of array initialization:

class A
{
public:
  A(int) {} //no default constructor
};
class B
{
public:
  B() {} //default constructor available
};
int main()
{
  A a1[3] = {A(2), A(1), A(14)}; //OK n == m
  A a2[3] = {A(2)}; //ERROR A has no default constructor. Unable to value-initialize a2[1] and a2[2]
  B b1[3] = {B()}; //OK b1[1] and b1[2] are value initialized, in this case with the default-ctor
  int Array1[1000] = {0}; //All elements are initialized with 0;
  int Array2[1000] = {1}; //Attention: only the first element is 1, the rest are 0;
  bool Array3[1000] = {}; //the braces can be empty too. All elements initialized with false
  int Array4[1000]; //no initializer. This is different from an empty {} initializer in that
  //the elements in this case are not value-initialized, but have indeterminate values 
  //(unless, of course, Array4 is a global array)
  int array[2] = {1, 2, 3, 4}; //ERROR, too many initializers
}
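
The example above intentionally contains lines that do not compile. As a complement, here is a small compilable sketch of the two cases that are easiest to get wrong: fewer initializers than elements, and an omitted array size. The helper names (count_equal and the two functions that call it) are assumptions introduced only for this sketch.

```cpp
#include <cstddef>

// Counts the elements of a[0..n) that are equal to value.
std::size_t count_equal(const int* a, std::size_t n, int value)
{
  std::size_t count = 0;
  for (std::size_t i = 0; i < n; ++i)
    if (a[i] == value)
      ++count;
  return count;
}

// m < n: only a[0] is 1, the remaining 999 elements are value-initialized to 0.
std::size_t ones_in_partially_initialized_array()
{
  int a[1000] = {1};
  return count_equal(a, 1000, 1);
}

// n omitted: the array size is deduced from the number of initializers.
std::size_t deduced_array_size()
{
  int a[] = {1, 2, 3};
  return sizeof(a) / sizeof(a[0]);
}
```

Calling ones_in_partially_initialized_array() is expected to yield 1, not 1000, which is the pitfall flagged by the "Attention" comment in the example above.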

Now let's see how aggregate classes can be initialized with braces. Pretty much the same way. Instead of the array elements we will initialize the non-static data members in the order of their appearance in the class definition (they are all public by definition). If there are fewer initializers than members, the rest are value-initialized. If it is impossible to value-initialize one of the members which were not explicitly initialized, we get a compile-time error. If there are more initializers than necessary, we get a compile-time error as well.

struct X
{
  int i1;
  int i2;
};
struct Y
{
  char c;
  X x;
  int i[2];
  float f; 
protected:
  static double d;
private:
  void g(){}      
}; 

Y y = {'a', {10, 20}, {20, 30}};

In the above example y.c is initialized with 'a', y.x.i1 with 10, y.x.i2 with 20, y.i[0] with 20, y.i[1] with 30 and y.f is value-initialized, that is, initialized with 0.0. The protected static member d is not initialized at all, because it is static.
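
The values listed in the previous paragraph can be checked with a short sketch. It repeats X and Y without the static member and the private function, since those do not take part in the initialization; the function name check_y is an assumption made for this sketch.

```cpp
struct X
{
  int i1;
  int i2;
};

struct Y
{
  char c;
  X x;
  int i[2];
  float f;
};

// Returns true if brace initialization assigned the members in declaration
// order and value-initialized the member that has no initializer.
bool check_y()
{
  Y y = {'a', {10, 20}, {20, 30}}; // no initializer for f
  return y.c == 'a'
      && y.x.i1 == 10 && y.x.i2 == 20
      && y.i[0] == 20 && y.i[1] == 30
      && y.f == 0.0f;
}
```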

Aggregate unions are different in that you may initialize only their first member with braces. I think that if you are advanced enough in C++ to even consider using unions (their use may be very dangerous and must be thought of carefully), you could look up the rules for unions in the standard yourself :).
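
For completeness, a minimal sketch of the union rule, using a made-up union called Number. Only the first declared member can receive the brace initializer in C++03.

```cpp
// Hypothetical union for illustration only.
union Number
{
  int i;   // first member: the one a brace initializer sets
  float f;
};

// Brace-initializes the union and reads back its first member.
int first_member_of_union()
{
  Number n = {42}; // initializes n.i; C++03 has no brace syntax that would initialize n.f instead
  return n.i;
}
```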

Now that we know what's special about aggregates, let's try to understand the restrictions on classes; that is, why they are there. We should understand that memberwise initialization with braces implies that the class is nothing more than the sum of its members. If a user-defined constructor is present, it means that the user needs to do some extra work to initialize the members, so brace initialization would be incorrect. If virtual functions are present, it means that the objects of this class have (on most implementations) a pointer to the so-called vtable of the class, which is set in the constructor, so brace initialization would be insufficient. You could figure out the rest of the restrictions in a similar manner as an exercise :).

So enough about aggregates. Now we can define a stricter set of types, to wit, PODs.

What are PODs and why they are special

Formal definition from the C++ standard (C++03 9 §4):

A POD-struct is an aggregate class that has no non-static data members of type non-POD-struct, non-POD-union (or array of such types) or reference, and has no user-defined copy assignment operator and no user-defined destructor. Similarly, a POD-union is an aggregate union that has no non-static data members of type non-POD-struct, non-POD-union (or array of such types) or reference, and has no user-defined copy assignment operator and no user-defined destructor. A POD class is a class that is either a POD-struct or a POD-union.

Wow, this one's tougher to parse, isn't it? :) Let's leave unions out (on the same grounds as above) and rephrase in a bit clearer way:

An aggregate class is called a POD if it has no user-defined copy-assignment operator and destructor and none of its nonstatic members is a non-POD class, array of non-POD, or a reference.

What does this definition imply? (Did I mention POD stands for Plain Old Data?)

  • All POD classes are aggregates, or, to put it the other way around, if a class is not an aggregate then it is certainly not a POD
  • Classes, just like structs, can be PODs even though the standard term is POD-struct for both cases
  • Just like in the case of aggregates, it doesn't matter what static members the class has

Examples:

#include <vector> //needed for the static member below

struct POD
{
  int x;
  char y;
  void f() {} //no harm if there's a function
  static std::vector<char> v; //static members do not matter
};

struct AggregateButNotPOD1
{
  int x;
  ~AggregateButNotPOD1() {} //user-defined destructor
};

struct AggregateButNotPOD2
{
  AggregateButNotPOD1 arrOfNonPod[3]; //array of non-POD class
};
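
C++03 itself offers no way to ask the compiler whether a type is a POD. As a rough cross-check only, and stepping outside C++03, the sketch below uses the C++11 header <type_traits>, where a POD class is one that is both trivial and standard-layout. That definition differs from the C++03 one quoted above, but the two are expected to agree for these three simple types. The helper name is_pod_like and the simplified struct names are assumptions made for this sketch.

```cpp
#include <type_traits> // C++11 or later; not available in C++03

struct Pod
{
  int x;
  char y;
  void f() {} // member functions do not matter
};

struct AggregateButNotPod1
{
  int x;
  ~AggregateButNotPod1() {} // user-defined destructor
};

struct AggregateButNotPod2
{
  AggregateButNotPod1 arr[3]; // array of a non-POD class
};

// Approximates "is T a POD?" using the C++11 notion: trivial and standard-layout.
template <typename T>
bool is_pod_like()
{
  return std::is_trivial<T>::value && std::is_standard_layout<T>::value;
}
```

With these definitions, is_pod_like<Pod>() should return true, while the other two types should yield false because of the user-defined destructor.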

POD-classes, POD-unions, scalar types, and arrays of such types are collectively called POD-types.
PODs are special in many ways. I'll provide just some examples.

  • POD-classes are the closest to C structs. Unlike them, PODs can have member functions and arbitrary static members, but neither of these changes the memory layout of the object. So if you want to write a more or less portable dynamic library that can be used from C and even .NET, you should try to make all your exported functions take and return only parameters of POD-types.

  • The lifetime of objects of non-POD class type begins when the constructor has finished and ends when the destructor has finished. For POD classes, the lifetime begins when storage for the object is occupied and finishes when that storage is released or reused.

  • For objects of POD types it is guaranteed by the standard that when you memcpy the contents of your object into an array of char or unsigned char, and then memcpy the contents back into your object, the object will hold its original value. Do note that there is no such guarantee for objects of non-POD types. Also, you can safely copy POD objects with memcpy. The following example assumes T is a POD-type:

     #define N sizeof(T)
     char buf[N];
     T obj; // obj initialized to its original value
     memcpy(buf, &obj, N); // between these two calls to memcpy,
     // obj might be modified
     memcpy(&obj, buf, N); // at this point, each subobject of obj of scalar type
     // holds its original value
    

  • goto statement. As you may know, it is illegal (the compiler should issue an error) to make a jump via goto from a point where some variable was not yet in scope to a point where it is already in scope. This restriction applies only if the variable is of non-POD type. In the following example f() is ill-formed whereas g() is well-formed. Note that Microsoft's compiler is too liberal with this rule—it just issues a warning in both cases.

     int f()
     {
       struct NonPOD {NonPOD() {}};
       goto label;
       NonPOD x;
     label:
       return 0;
     }
    
     int g()
     {
       struct POD {int i; char c;};
       goto label;
       POD x;
     label:
       return 0;
     }
    

  • It is guaranteed that there will be no padding at the beginning of a POD object. In other words, if a POD-class A's first member is of type T, you can safely reinterpret_cast from A* to T* and get the pointer to the first member and vice versa.
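
Two of the guarantees in the list above, the memcpy round trip and the conversion between a pointer to the object and a pointer to its first member, can be sketched as follows. Header is a made-up POD type and both function names are assumptions introduced for this sketch.

```cpp
#include <cstring>

// Hypothetical POD type used only for this sketch.
struct Header
{
  int id;
  char tag;
};

// Saves obj into a char buffer, overwrites obj, then restores it from the buffer.
bool memcpy_round_trip_restores_value()
{
  Header obj = {7, 'h'};
  char buf[sizeof(Header)];
  std::memcpy(buf, &obj, sizeof(Header));
  obj.id = -1;   // modify the object between the two copies
  obj.tag = '?';
  std::memcpy(&obj, buf, sizeof(Header));
  return obj.id == 7 && obj.tag == 'h';
}

// Converts a pointer to the object into a pointer to its first member and back.
bool first_member_shares_address()
{
  Header obj = {7, 'h'};
  int* first = reinterpret_cast<int*>(&obj);       // no padding before the first member
  Header* back = reinterpret_cast<Header*>(first); // and the conversion works in reverse
  return first == &obj.id && *first == 7 && back == &obj;
}
```

Neither technique should be applied to a non-POD type, for which the standard gives no such guarantees.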

The list goes on and on…

Conclusion

It is important to understand what exactly a POD is because many language features, as you see, behave differently for them.
