Inclusion of unused symbols in object files by compiler in C vs C++

Question

This might be a dumb question, but maybe someone can provide some insight.

I have some global variables defined in a header file (yes yes I know that's bad, but this is just a hypothetical situation). I include this header file in two source files, which are then compiled into two object files. The global symbols are not referenced anywhere in the code.

If the source files are C, then it looks like the compiler omits the global symbols and everything links without errors. If the source files are C++, the symbols are included in both object files and then I get linker errors. For C++ I am using extern "C" when I include the header.

I am using the Microsoft compiler from VS2005.

Here is my code:

Header file (test.h):

#ifndef __TEST_H
#define __TEST_H

/* declaration in header file */
void *ptr;

#endif

C Source files:

test1.c

#include "test.h"

int main( ) {
    return 0;
}

test2.c

#include "test.h"

C++ Source Files:

test1.cpp

extern "C" {
#include "test.h"
}

int main( ) {
    return 0;
}

test2.cpp

extern "C" {
#include "test.h"
}

For C, the object files look something like this:

Dump of file test1.obj

File Type: COFF OBJECT

COFF SYMBOL TABLE
000 006DC627 ABS    notype       Static       | @comp.id
001 00000001 ABS    notype       Static       | @feat.00
002 00000000 SECT1  notype       Static       | .drectve
    Section length   2F, #relocs    0, #linenums    0, checksum        0
004 00000000 SECT2  notype       Static       | .debug$S
    Section length  228, #relocs    7, #linenums    0, checksum        0
006 00000004 UNDEF  notype       External     | _ptr
007 00000000 SECT3  notype       Static       | .text
    Section length    7, #relocs    0, #linenums    0, checksum 96F779C9
009 00000000 SECT3  notype ()    External     | _main
00A 00000000 SECT4  notype       Static       | .debug$T
    Section length   1C, #relocs    0, #linenums    0, checksum        0

String Table Size = 0x0 bytes

And for C++ they look something like this:

Dump of file test1.obj

File Type: COFF OBJECT

COFF SYMBOL TABLE
000 006EC627 ABS    notype       Static       | @comp.id
001 00000001 ABS    notype       Static       | @feat.00
002 00000000 SECT1  notype       Static       | .drectve
    Section length   2F, #relocs    0, #linenums    0, checksum        0
004 00000000 SECT2  notype       Static       | .debug$S
    Section length  228, #relocs    7, #linenums    0, checksum        0
006 00000000 SECT3  notype       Static       | .bss
    Section length    4, #relocs    0, #linenums    0, checksum        0
008 00000000 SECT3  notype       External     | _ptr
009 00000000 SECT4  notype       Static       | .text
    Section length    7, #relocs    0, #linenums    0, checksum 96F779C9
00B 00000000 SECT4  notype ()    External     | _main
00C 00000000 SECT5  notype       Static       | .debug$T
    Section length   1C, #relocs    0, #linenums    0, checksum        0

String Table Size = 0x0 bytes

I notice that _ptr is listed as UNDEF when I compile the C source, and it is defined when I compile the C++ source, which results in linker errors.

I understand that this is not a good thing to do in real life, I am just trying to understand why this is different.

Thanks.

Solution

In C, identifiers have three different types of "linkage":

  1. external linkage: roughly, this is what people mean by "global variables". In common terms, it refers to identifiers that are visible "everywhere".
  2. internal linkage: these are file-scope identifiers declared with the static keyword; they are visible only within their own translation unit.
  3. no linkage: these are objects that are "temporary" or "automatic", such as variables declared inside a function (commonly referred to as "local variables").

For objects with external linkage, you can have only one definition. Since your header file defines such an object and is included in two C files, it is undefined behavior (but see below). The fact that your C compiler doesn't complain does not mean it is OK to do so in C. For this, you must read the C standard. (Or, assuming no bugs in your compiler, if it is invoked in a standards-compliant mode, and if it complains about something [gives a diagnostic], it probably means your program isn't compliant.)

In other words, you can't test what is allowed by the language by testing something and checking if your compiler allows it. For this, you must read the standard.

Note that there is a subtle difference between a definition and a tentative definition. First, two files that each define x with an initializer:

$ cat a.c
int x = 0;
$ cat b.c
#include <stdio.h>
int x = 0;
int main(void)
{
    printf("%d\n", x);
    return 0;
}
$ gcc -ansi -pedantic -W -Wall -c a.c
$ gcc -ansi -pedantic -W -Wall -c b.c
$ gcc -o def a.o b.o
b.o:(.bss+0x0): multiple definition of `x'
a.o:(.bss+0x0): first defined here
collect2: ld returned 1 exit status

Now, let's change a.c:

$ cat a.c
int x; /* Note missing " = 0", so tentative definition */

Now compile it:

$ gcc -ansi -pedantic -W -Wall -c a.c
$ gcc -o def a.o b.o
$ ./def
0

We can change b.c instead:

$ cat a.c
int x = 0;
$ cat b.c
#include <stdio.h>
int x; /* tentative definition */
int main(void)
{
    printf("%d\n", x);
    return 0;
}
$ gcc -ansi -pedantic -W -Wall -c a.c
$ gcc -ansi -pedantic -W -Wall -c b.c
$ gcc -o def a.o b.o
$ ./def
0

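A "tentative definition" becomes a "real definition" at the end of its translation unit if that translation unit contains no other definition of the identifier; it then behaves like a definition with an initializer of 0. So we could have changed both files to contain int x; and gcc as used here would still link them. Strictly speaking, the program then has two external definitions of x, which the standard leaves undefined (see the quote below); merging them is a widespread extension that Annex J.5 lists under "Multiple external definitions". GCC 10 and later default to -fno-common, so the successful links above need -fcommon there.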

So, you may have a tentative definition in the header file. We need to see the actual code to be sure.

The C standard says that the following is undefined behavior (Annex J.2, paragraph 1):

An identifier with external linkage is used, but in the program there does not exist exactly one external definition for the identifier, or the identifier is not used and there exist multiple external definitions for the identifier.

C++ may have different rules.

Edit: As per this thread on comp.lang.c++, C++ does not have tentative definitions. The reason being:

This avoids having different initialization rules for built-in types and user-defined types.

(The thread deals with the same question, btw.)

Now I am almost sure that OP's code contains what C calls a "tentative definition" in the header file, which is why it is accepted as C and rejected as C++. We will know for sure only when we see the code, though.

More information on "tentative definitions" and why they are needed is in this excellent post on comp.lang.c (by Chris Torek).
