How exactly does linking work?

Question

The way I understand the compilation process:

1) Preprocessing: All of your macros are replaced with their actual values, all comments are removed, etc. Your #include statements are replaced with the literal text of the files you've included.

2) Compilation: I won't drill down too deep here, but the result is an assembly file for whatever architecture you are on.

3) Assembly: Takes the assembly file and converts it into binary instructions, i.e., machine code.

4) Linking: This is where I'm confused. At this point you have an executable. But if you actually run that executable, what happens? Is the problem that you may have included *.h files, and those only contain function prototypes? So if you actually call one of the functions from those files, it won't have a definition and your program will crash?

If that's the case, what exactly does linking do, under the hood? How does it find the .c file associated with the .h that you included, and how does it inject that into your machine code? Doesn't it have to go through the whole compilation process again for that file?

Now, I've come to understand that there are two types of linking, dynamic and static. Is it static when you actually recompile the source of the library for every executable you create? I don't quite understand how dynamic linking would work. So you compile one executable library that is shared by all of the processes that use it? How is that possible, exactly? Wouldn't it be outside of the address space of the processes trying to access it? Also, for dynamic linking, don't you still need to compile the library at some point? Is it just sitting there constantly in memory waiting to be used? When is it compiled?

Can you go through the above, clear up all of the misunderstandings and wrong assumptions, and substitute your correct explanation?

Recommended answer

At this point you have an executable.

No. At this point, you have object files, which are not, in themselves, executable.

But if you actually run that executable what happens?

Something like this:

h2co3-macbook:~ h2co3$ clang -Wall -o quirk.o quirk.c -c
h2co3-macbook:~ h2co3$ chmod +x quirk.o
h2co3-macbook:~ h2co3$ ./quirk.o
-bash: ./quirk.o: Malformed Mach-o file

I told you it was not an executable.

Is the problem that you may have included *.h files, and those only contain function prototypes?

Pretty close, actually. A translation unit (.c file) is (generally) transformed to assembly/machine code that represents what it does. If it calls a function, then there will be a reference to that function in the resulting object file, but no definition.

So if you actually call one of the functions from those files, it won't have a definition and your program will crash?

As I've stated, it won't even run. Let me repeat: an object file is not executable.

what exactly does linking do, under the hood? How does it find the .c file associated with the .h that you included [...]

It doesn't. It looks through the other object files generated from .c files, and then through libraries (which are essentially just collections of object files).

And it finds them because you tell it what to look for. Assuming you have a project which consists of two .c files which call each other's functions, this won't work:

gcc -c file1.c -o file1.o
gcc -c file2.c -o file2.o
gcc -o my_prog file1.o

It will fail with a linker error: the linker won't find the definition of the functions implemented in file2.c (and file2.o). But this will work:

gcc -c file1.c -o file1.o
gcc -c file2.c -o file2.o
gcc -o my_prog file1.o file2.o

[...] and how does it inject that into your machine code?

Object files contain stub references (usually in the form of placeholder addresses or explicit, human-readable symbol names) to the functions they call. Then, the linker looks at each library and object file, finds the matching definitions (and throws an error if a function definition couldn't be found), then substitutes the stub references with actual "call this function" machine code instructions. (Yes, this is greatly simplified, but without you asking about a specific architecture and a specific compiler/linker, it's hard to be more precise...)

Is static when you actually recompile the source of the library for every executable you create?

No. Static linkage means that the machine code of the object files of a library is actually copied/merged into your final executable. Dynamic linkage means that a library is loaded into memory once, then the aforementioned stub function references are resolved by the operating system when your executable is launched. No machine code from the library will be copied into your final executable. (So here, the linker in the toolchain only does part of the job.)

The following may help you to achieve enlightenment: if you statically link an executable, it will be self-contained. It will run anywhere (on a compatible architecture and operating system, anyway). If you link it dynamically, it will only run on a machine if that particular machine has all the libraries installed that the program references.
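
The difference is easy to reproduce. The sketch below assumes Linux with gcc and ar; the library name libgreet, its greet function and all file names are hypothetical. Note that only libgreet is linked statically here: the C library itself is still linked dynamically in both programs unless you ask gcc for a fully static build.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > greet.c <<'EOF'
#include <stdio.h>
void greet(void) { puts("hello from libgreet"); }
EOF
cat > main.c <<'EOF'
void greet(void);                 /* provided by the library */
int main(void) { greet(); return 0; }
EOF

# Static: pack the object file into an archive and link it in.
# The machine code of greet() is copied into prog_static.
gcc -c greet.c -o greet.o
ar rcs libgreet_static.a greet.o
gcc -o prog_static main.c libgreet_static.a

# Dynamic: build a shared library and link against it.
# prog_dynamic only records that it needs libgreet.so.
gcc -shared -fPIC -o libgreet.so greet.c
gcc -o prog_dynamic main.c -L. -lgreet

# Both run while the shared library can be found.
./prog_static
LD_LIBRARY_PATH=. ./prog_dynamic

# Remove the shared library: the static program is unaffected,
# the dynamic one can no longer be started.
rm libgreet.so
./prog_static
LD_LIBRARY_PATH=. ./prog_dynamic || echo "prog_dynamic cannot start without libgreet.so"
```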

So you compile one executable library that is shared by all of your processes that use it? How is that possible, exactly? Wouldn't it be outside of the address space of the processes trying to access it?

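The dynamic linker/loader component of the OS takes care of all of that. It maps the library into the address space of each process that uses it, so the library's code is never outside the reach of the process calling it.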

Also, for dynamic linking, don't you still need to compile the library at some juncture in time?

As I've already mentioned: yes, it is already compiled. Then it is loaded into memory at some point: typically when a program that depends on it starts, or when it's first used if the program loads it explicitly at run time.

When is it compiled?

Some time before it can be used. Typically, a library is compiled, then installed to a location on your system so that the OS and the compiler/linker know about its existence, then you can start compiling (um, linking) programs that use that library. Not earlier.
