Relation between object file and shared object file

Problem description

what is the relation between shared object(.so) file and object(.o) file?

can you please explain via example?

Solution

Let's say you have the following C source file, call it name.c:

#include <stdio.h>
#include <stdlib.h>

void print_name(const char * name)
{
    printf("My name is %s\n", name);
}

When you compile it with cc -c name.c, you generate name.o. (Plain cc name.c would also try to link an executable, which fails here because there is no main function.) The .o contains the compiled code and data for all functions and variables defined in name.c, as well as an index associating their names with the actual code. If you look at that index, say with the nm tool (available on Linux and many other Unixes), you'll notice two entries:

00000000 T print_name
         U printf

What this means: there are two symbols (names of functions or variables, but not names of classes, structs, or any types) stored in the .o. The first, marked with T, has its definition in name.o. The other, marked with U, is merely a reference. The code for print_name can be found here, but the code for printf cannot. Before your program can run, the linker has to take every symbol that is only a reference and look up its definition in other object files or libraries, so that everything can be joined into a complete program or complete library. An object file is therefore the definitions found in the source file, converted to binary form, and available for placing into a full program.
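To make the symbol kinds more concrete, here is a small sketch of a source file that should produce several different nm entries. The names format_name, format_name_calls, name_limit, calls_made and clamp_length are made up for this illustration and are not part of the original example. The nm letters in the comments are what you would typically see with an unoptimized build on Linux; the exact output depends on the compiler, flags and platform.

```c
#include <stdio.h>
#include <string.h>

/* Initialized global variable: defined here and visible to other files.
   nm usually marks it 'D' (initialized data). */
int name_limit = 16;

/* File-local (static) counter: defined here, but other object files
   cannot refer to it. nm usually marks it 'b' (lowercase = local). */
static int calls_made;

/* File-local helper function: usually 't'. An optimizing compiler may
   inline it, in which case it does not appear at all. */
static size_t clamp_length(size_t len)
{
    return len > (size_t)name_limit ? (size_t)name_limit : len;
}

/* Global function: usually 'T', like print_name in the text above.
   strlen and snprintf are only referenced, not defined, in this file,
   so they would normally be listed as 'U'. */
int format_name(char *out, size_t out_size, const char *name)
{
    size_t len = clamp_length(strlen(name));
    calls_made++;
    return snprintf(out, out_size, "My name is %.*s", (int)len, name);
}

/* Global accessor so that other files can read the local counter. */
int format_name_calls(void)
{
    return calls_made;
}
```

Compiling this with cc -c and running nm on the result should show the uppercase entries (T, D, U) that take part in linking across files, next to the lowercase ones (t, b) that stay private to this object file.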

You can link together .o files one by one, but you usually don't: there are generally a lot of them, and they are an implementation detail. You'd really prefer to have them all collected into bundles of related objects, with well-recognized names. These bundles are called libraries and they come in two forms: static and dynamic.

A static library (in Unix) is almost always suffixed with .a (examples include libc.a, which is the C standard library, and libm.a, which is the C math library). Continuing the example, you'd build your static library with ar rc libname.a name.o. If you run nm on libname.a you'll see this:

name.o:
00000000 T print_name
         U printf

As you can see, it is primarily a big table of object files with an index for finding all the names in it. Just like object files, it contains both the symbols defined in every .o and the symbols referred to by them. If you were to add another .o to the archive (e.g. date.o, defining print_date), you'd see another entry like the one above.

If you link a static library into an executable, the library's code is copied into the executable. This is just like linking in the individual .o files, although the linker normally pulls in only those .o members that define a symbol the program actually needs. As you can imagine, this can make your program very large, especially if you are using (as most modern applications are) a lot of libraries.

A dynamic or shared library is suffixed with .so. Like its static analogue, it bundles the compiled code from a set of object files behind an index of names. You'd build it with cc -shared -o libname.so name.o (on many platforms name.o must first be compiled as position-independent code, e.g. with cc -fPIC -c name.c). Looking at it with nm is quite a bit different than the static library though. On my system it contains about two dozen symbols, only two of which are print_name and printf:

00001498 a _DYNAMIC
00001574 a _GLOBAL_OFFSET_TABLE_
         w _Jv_RegisterClasses
00001488 d __CTOR_END__
00001484 d __CTOR_LIST__
00001490 d __DTOR_END__
0000148c d __DTOR_LIST__
00000480 r __FRAME_END__
00001494 d __JCR_END__
00001494 d __JCR_LIST__
00001590 A __bss_start
         w __cxa_finalize@@GLIBC_2.1.3
00000420 t __do_global_ctors_aux
00000360 t __do_global_dtors_aux
00001588 d __dso_handle
         w __gmon_start__
000003f7 t __i686.get_pc_thunk.bx
00001590 A _edata
00001594 A _end
00000454 T _fini
000002f8 T _init
00001590 b completed.5843
000003c0 t frame_dummy
0000158c d p.5841
000003fc T print_name
         U printf@@GLIBC_2.0

A shared library differs from a static library in one very important way: it does not embed itself in your final executable. Instead the executable contains a reference to that shared library that is resolved, not at link time, but at run-time. This has a number of advantages:

  • Your executable is much smaller. It only contains the code you explicitly linked via the object files. The external libraries are references and their code does not go into the binary.
  • You can share (hence the name) one library's bits among multiple executables.
  • You can, if you are careful about binary compatibility, update the code in the library between runs of the program, and the program will pick up the new library without you needing to change it.

There are some disadvantages:

  • It takes time to link a program together. With shared libraries some of this time is deferred to every time the executable runs.
  • The process is more complex. All the additional symbols in the shared library are part of the infrastructure needed to make the library link up at run-time.
  • You run the risk of subtle incompatibilities between differing versions of the library. On Windows this is called "DLL hell".

(If you think about it, many of these are the same reasons programs use or do not use references and pointers instead of directly embedding objects of a class into other objects. The analogy is pretty direct.)
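Run-time resolution normally happens automatically when the program starts, but on POSIX systems the same machinery can also be driven by hand through dlopen and dlsym, which makes the lookup step visible. The sketch below is only an illustration under some assumptions: the helper name call_from_shared_library is made up, and the library name used in the usage note (libm.so.6, the C math library on glibc-based Linux) will differ on other systems. On older glibc versions you also need to link with -ldl.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Signature of the function we expect to find in the library. */
typedef double (*unary_math_fn)(double);

/* Hypothetical helper: loads `library`, looks up `symbol`, calls it with
   `x` and stores the answer in `*result`.
   Returns 0 on success and -1 if loading or lookup fails. */
int call_from_shared_library(const char *library, const char *symbol,
                             double x, double *result)
{
    /* Map the shared object into this process and resolve its own
       undefined symbols immediately (RTLD_NOW). */
    void *handle = dlopen(library, RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }

    dlerror(); /* clear any stale error state before the lookup */
    void *address = dlsym(handle, symbol);
    const char *error = dlerror();
    if (error != NULL) {
        fprintf(stderr, "dlsym failed: %s\n", error);
        dlclose(handle);
        return -1;
    }

    /* Convert the object pointer to a function pointer in the way the
       POSIX dlsym documentation suggests. */
    unary_math_fn fn;
    *(void **)(&fn) = address;

    *result = fn(x);
    dlclose(handle);
    return 0;
}
```

For example, on a glibc-based Linux system, call_from_shared_library("libm.so.6", "cos", 0.0, &r) should look up the cos symbol in the math library at run time and leave its result in r. Most programs never call these functions directly; the dynamic loader performs the equivalent lookups for every U symbol before main starts.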

OK, that's a lot of detail, and I've skipped a lot, such as how the linking process actually works. I hope you can follow it. If not, ask for clarification.
