Why aren't my include guards preventing recursive inclusion and multiple symbol definitions?


Question

Two common questions about include guards:

  1. FIRST QUESTION:

    Why aren't include guards protecting my header files from mutual, recursive inclusion? I keep getting errors about non-existing symbols which are obviously there or even weirder syntax errors every time I write something like the following:

    "a.h"

    #ifndef A_H
    #define A_H
    
    #include "b.h"
    
    ...
    
    #endif // A_H
    

    "b.h"

    #ifndef B_H
    #define B_H
    
    #include "a.h"
    
    ...
    
    #endif // B_H
    

    "main.cpp"

    #include "a.h"
    int main()
    {
        ...
    }
    

    Why do I get errors compiling "main.cpp"? What do I need to do to solve my problem?


  2. SECOND QUESTION:

    Why aren't include guards preventing multiple definitions? For instance, when my project contains two files that include the same header, sometimes the linker complains about some symbol being defined multiple times. For instance:

    "header.h"

    #ifndef HEADER_H
    #define HEADER_H
    
    int f()
    {
        return 0;
    }
    
    #endif // HEADER_H
    

    "source1.cpp"

    #include "header.h"
    ...
    

    "source2.cpp"

    #include "header.h"
    ...
    

    Why is this happening? What do I need to do to solve my problem?

Solution

FIRST QUESTION:

Why aren't include guards protecting my header files from mutual, recursive inclusion?

They are.

What they are not helping with is dependencies between the definitions of data structures in mutually-including headers. To see what this means, let's start with a basic scenario and see why include guards do help with mutual inclusions.

Suppose your mutually including a.h and b.h header files have trivial content, i.e. the ellipses in the code sections from the question's text are replaced with the empty string. In this situation, your main.cpp will happily compile. And this is only thanks to your include guards!

If you're not convinced, try removing them:

//================================================
// a.h

#include "b.h"

//================================================
// b.h

#include "a.h"

//================================================
// main.cpp
//
// Good luck getting this to compile...

#include "a.h"
int main()
{
    ...
}

You'll notice that the compiler will report a failure when it reaches the inclusion depth limit. This limit is implementation-specific. Per Paragraph 16.2/6 of the C++11 Standard:

A #include preprocessing directive may appear in a source file that has been read because of a #include directive in another file, up to an implementation-defined nesting limit.

So what's going on?

  1. When parsing main.cpp, the preprocessor will meet the directive #include "a.h". This directive tells the preprocessor to process the header file a.h, take the result of that processing, and replace the string #include "a.h" with that result;
  2. While processing a.h, the preprocessor will meet the directive #include "b.h", and the same mechanism applies: the preprocessor shall process the header file b.h, take the result of its processing, and replace the #include directive with that result;
  3. When processing b.h, the directive #include "a.h" will tell the preprocessor to process a.h and replace that directive with the result;
  4. The preprocessor will start parsing a.h again, will meet the #include "b.h" directive again, and this will set up a potentially infinite recursive process. When reaching the critical nesting level, the compiler will report an error.

When include guards are present, however, no infinite recursion will be set up in step 4. Let's see why:

  1. (same as before) When parsing main.cpp, the preprocessor will meet the directive #include "a.h". This tells the preprocessor to process the header file a.h, take the result of that processing, and replace the string #include "a.h" with that result;
  2. While processing a.h, the preprocessor will meet the directive #ifndef A_H. Since the macro A_H has not yet been defined, it will keep processing the following text. The subsequent directive (#define A_H) defines the macro A_H. Then, the preprocessor will meet the directive #include "b.h": the preprocessor shall now process the header file b.h, take the result of its processing, and replace the #include directive with that result;
  3. When processing b.h, the preprocessor will meet the directive #ifndef B_H. Since the macro B_H has not yet been defined, it will keep processing the following text. The subsequent directive (#define B_H) defines the macro B_H. Then, the directive #include "a.h" will tell the preprocessor to process a.h and replace the #include directive in b.h with the result of preprocessing a.h;
  4. The preprocessor will start processing a.h again, and meet the #ifndef A_H directive again. However, the macro A_H was already defined during the previous pass. Therefore, the preprocessor will skip the following text this time until the matching #endif directive is found, and the output of this processing is the empty string (supposing nothing follows the #endif directive, of course). The preprocessor will therefore replace the #include "a.h" directive in b.h with the empty string, and will trace back the execution until it replaces the original #include directive in main.cpp.
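The guarded walk-through above can be reproduced in a single file by pasting the headers in by hand, the way the preprocessor would. This is only a sketch: the two constexpr markers are hypothetical additions (the headers in the question are otherwise empty) so that the result can be observed.

```cpp
// Hand-expanded view of main.cpp's `#include "a.h"` with guarded headers.
// The constexpr markers are hypothetical and exist only for illustration.

// ---- a.h, first visit ----
#ifndef A_H
#define A_H

    // ---- b.h, pulled in by a.h ----
    #ifndef B_H
    #define B_H

        // ---- a.h, second visit (pulled in by b.h) ----
        #ifndef A_H   // false: A_H is already defined, so this group is skipped
        #define A_H
        #error "unreachable: the guard skips this text"
        #endif // A_H

    constexpr bool b_h_body_seen = true;   // rest of b.h
    #endif // B_H

constexpr bool a_h_body_seen = true;       // rest of a.h
#endif // A_H

// Both bodies were processed, and neither was processed twice
// (a second pass would redefine the markers and fail to compile).
static_assert(a_h_body_seen && b_h_body_seen, "each header body is seen once");
```

The #error directive never fires because directives inside a skipped group are not executed; that is exactly what stops the recursion in step 4.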

Thus, include guards do protect against mutual inclusion. However, they can't help with dependencies between the definitions of your classes in mutually-including files:

//================================================
// a.h

#ifndef A_H
#define A_H

#include "b.h"

struct A
{
};

#endif // A_H

//================================================
// b.h

#ifndef B_H
#define B_H

#include "a.h"

struct B
{
    A* pA;
};

#endif // B_H

//================================================
// main.cpp
//
// Good luck getting this to compile...

#include "a.h"
int main()
{
    ...
}

Given the above headers, main.cpp will not compile.

Why is this happening?

To see what's going on, it is enough to go through steps 1-4 again.

It is easy to see that the first three steps and most of the fourth step are unaffected by this change (just read through them to get convinced). However, something different happens at the end of step 4: after replacing the #include "a.h" directive in b.h with the empty string, the preprocessor moves on to the rest of b.h and, in particular, the definition of B. Unfortunately, the definition of B mentions class A, which has not been seen yet: the definition of A sits in a.h after the #include "b.h" directive, and the include guards are exactly what prevented the nested inclusion of a.h from pulling it in earlier!

Declaring a member variable of a type which has not been previously declared is, of course, an error, and the compiler will politely point that out.

What do I need to do to solve my problem?

You need forward declarations.

In fact, the definition of class A is not required in order to define class B, because a pointer to A is being declared as a member variable, and not an object of type A. Since pointers have fixed size, the compiler won't need to know the exact layout of A nor to compute its size in order to properly define class B. Hence, it is enough to forward-declare class A in b.h and make the compiler aware of its existence:

//================================================
// b.h

#ifndef B_H
#define B_H

// Forward declaration of A: no need to #include "a.h"
struct A;

struct B
{
    A* pA;
};

#endif // B_H

Your main.cpp will now certainly compile. A couple of remarks:

  1. Breaking the mutual inclusion by replacing the #include directive in b.h with a forward declaration is not only enough to express the dependency of B on A: using forward declarations whenever possible/practical is also considered good programming practice, because it helps avoid unnecessary inclusions, thus reducing the overall compilation time. However, once the mutual inclusion is eliminated completely (in this example, that would mean a.h also dropping its #include "b.h", since A does not use B), main.cpp will have to be modified to #include both a.h and b.h (if the latter is needed at all), because b.h is then no longer indirectly #included through a.h;
  2. While a forward declaration of class A is enough for the compiler to declare pointers to that class (or to use it in any other context where incomplete types are acceptable), dereferencing pointers to A (for instance to invoke a member function) or computing its size are illegal operations on incomplete types: if that is needed, the full definition of A needs to be available to the compiler, which means the header file that defines it must be included. This is why class definitions and the implementation of their member functions are usually split into a header file and an implementation file for that class (class templates are an exception to this rule): implementation files, which are never #included by other files in the project, can safely #include all the necessary headers to make definitions visible. Header files, on the other hand, should not #include other header files unless they really need to do so (for instance, to make the definition of a base class visible), and should use forward declarations whenever possible/practical.
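To make the second remark concrete, here is a minimal single-file sketch of what the compiler sees for main.cpp once b.h forward-declares A. The member function id() and the helper id_through() are hypothetical names introduced only for this illustration; they are not part of the original headers.

```cpp
// Hand-expanded view of main.cpp after the fix.
// id() and id_through() are hypothetical, added only for illustration.

struct A;                 // from b.h: forward declaration, A is incomplete here

struct B
{
    A* pA;                // OK: a pointer to an incomplete type
    // A a;               // would be an error: needs the full definition of A
};

struct A                  // from a.h: A is a complete type from here on
{
    int id() const { return 42; }
};

// Dereferencing the pointer requires the full definition of A,
// which is visible at this point.
int id_through(const B& b)
{
    return b.pA->id();
}
```

In a real project, id_through() would typically live in an implementation file that includes both headers, which is the split described above.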

SECOND QUESTION:

Why aren't include guards preventing multiple definitions?

They are.

What they are not protecting you from is multiple definitions in separate translation units. This is also explained in this Q&A on StackOverflow.

To see that, try removing the include guards and compiling the following, modified version of source1.cpp (or source2.cpp, for that matter):

//================================================
// source1.cpp
//
// Good luck getting this to compile...

#include "header.h"
#include "header.h"

int main()
{
    ...
}

The compiler will certainly complain here about f() being redefined. That's obvious: its definition is being included twice! However, the above source1.cpp will compile without problems when header.h contains the proper include guards. That's expected.
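As a rough sketch of why the guarded version compiles, this is what source1.cpp amounts to after both #include directives have been replaced by the text of header.h (pasted in by hand here, so that the example fits in one file):

```cpp
// ---- first #include "header.h" ----
#ifndef HEADER_H
#define HEADER_H

int f()
{
    return 0;
}

#endif // HEADER_H

// ---- second #include "header.h" ----
#ifndef HEADER_H   // false: HEADER_H is already defined, the body is skipped
#define HEADER_H

int f()            // never seen by the compiler, so no redefinition error
{
    return 0;
}

#endif // HEADER_H
```

Deleting the four guard lines from each copy leaves two definitions of f() in the same translation unit, which is the compiler error described above.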

Still, even when the include guards are present and the compiler stops bothering you with error messages, the linker will insist that multiple definitions are found when merging the object code obtained from the compilation of source1.cpp and source2.cpp, and will refuse to generate your executable.

Why is this happening?

Basically, each .cpp file (the technical term in this context is translation unit) in your project is compiled separately and independently. When parsing a .cpp file, the preprocessor will process all the #include directives and expand all macro invocations it encounters, and the output of this pure text processing will be given as input to the compiler, which translates it into object code. Once the compiler is done with producing the object code for one translation unit, it will proceed with the next one, and all the macro definitions that have been encountered while processing the previous translation unit will be forgotten.

In fact, compiling a project with n translation units (.cpp files) is like executing the same program (the compiler) n times, each time with a different input: different executions of the same program won't share the state of the previous program execution(s). Thus, each translation is performed independently and the preprocessor symbols encountered while compiling one translation unit will not be remembered when compiling other translation units (if you think about it for a moment, you will easily realize that this is actually a desirable behavior).

Therefore, even though include guards help you prevent recursive mutual inclusions and redundant inclusions of the same header in one translation unit, they can't detect whether the same definition is included in different translation units.

Yet, when merging the object code generated from the compilation of all the .cpp files of your project, the linker will see that the same symbol is defined more than once, which violates the One Definition Rule. Per Paragraph 3.2/3 of the C++11 Standard:

Every program shall contain exactly one definition of every non-inline function or variable that is odr-used in that program; no diagnostic required. The definition can appear explicitly in the program, it can be found in the standard or a user-defined library, or (when appropriate) it is implicitly defined (see 12.1, 12.4 and 12.8). An inline function shall be defined in every translation unit in which it is odr-used.

Hence, the linker will emit an error and refuse to generate the executable of your program.

What do I need to do to solve my problem?

If you want to keep your function definition in a header file that is #included by multiple translation units (notice that no problem will arise if your header is #included by just one translation unit), you need to use the inline keyword.

Otherwise, you need to keep only the declaration of your function in header.h, putting its definition (body) into one separate .cpp file only (this is the classical approach).
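A minimal sketch of this classical split is shown below, with both files written one after the other so the example stays self-contained. The file name header.cpp is an assumption made for illustration; any single .cpp file in the project would do.

```cpp
// ---- header.h: declaration only, safe to include everywhere ----
#ifndef HEADER_H
#define HEADER_H

int f();

#endif // HEADER_H

// ---- header.cpp (hypothetical file name): the one and only definition ----
// In a real project this file would start with: #include "header.h"
int f()
{
    return 0;
}
```

source1.cpp and source2.cpp then see only the declaration, so the linker finds exactly one definition of f().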

The inline keyword represents a non-binding request to the compiler to inline the function's body directly at the call site, rather than setting up a stack frame for a regular function call. Although the compiler doesn't have to fulfill your request, the inline keyword does succeed in telling the linker to tolerate multiple symbol definitions. According to Paragraph 3.2/5 of the C++11 Standard:

There can be more than one definition of a class type (Clause 9), enumeration type (7.2), inline function with external linkage (7.1.2), class template (Clause 14), non-static function template (14.5.6), static data member of a class template (14.5.1.3), member function of a class template (14.5.1.1), or template specialization for which some template parameters are not specified (14.7, 14.5.5) in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements [...]

The above Paragraph basically lists all the definitions which are commonly put in header files, because they can be safely included in multiple translation units. All other definitions with external linkage, instead, belong in source files.
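Applied to the header.h from the question, the inline approach would look roughly like this; the only change from the original header is the added keyword:

```cpp
// ---- header.h ----
#ifndef HEADER_H
#define HEADER_H

// inline permits one identical definition per translation unit,
// so source1.cpp and source2.cpp can both include this header.
inline int f()
{
    return 0;
}

#endif // HEADER_H
```

The include guards are still needed: inline deals with definitions across translation units, while the guards deal with repeated inclusion inside a single one.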

Using the static keyword instead of the inline keyword also results in suppressing linker errors by giving your function internal linkage, thus making each translation unit hold a private copy of that function (and of its local static variables). However, this eventually results in a larger executable, and the use of inline should be preferred in general.

An alternative way of achieving the same result as with the static keyword is to put function f() in an unnamed namespace. Per Paragraph 3.5/4 of the C++11 Standard:

An unnamed namespace or a namespace declared directly or indirectly within an unnamed namespace has internal linkage. All other namespaces have external linkage. A name having namespace scope that has not been given internal linkage above has the same linkage as the enclosing namespace if it is the name of:

— a variable; or

— a function; or

— a named class (Clause 9), or an unnamed class defined in a typedef declaration in which the class has the typedef name for linkage purposes (7.1.3); or

— a named enumeration (7.2), or an unnamed enumeration defined in a typedef declaration in which the enumeration has the typedef name for linkage purposes (7.1.3); or

— an enumerator belonging to an enumeration with linkage; or

— a template.

For the same reason mentioned above, the inline keyword should be preferred.
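For comparison, the two internal-linkage alternatives discussed above might be written as follows. The names f_static and f_unnamed are hypothetical, chosen only so that both variants can sit in one file.

```cpp
// Variant 1: static gives the function internal linkage.
static int f_static()
{
    return 0;
}

// Variant 2: an unnamed namespace has the same effect on linkage.
namespace
{
    int f_unnamed()
    {
        return 0;
    }
}
```

If either definition were placed in a header, every translation unit including it would get its own private copy of the function, which is why the linker no longer complains and also why the executable can grow.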
