Not clear on the job of the linker


Question

I'm using C on Windows. This question was previously part of "What happens to identifiers in a program?". I split it off to reduce the number of questions in one post. This is a standalone query (it doesn't depend on the previous question).

If there is nothing to link (i.e. I'm not using any libraries; I know such a program won't be of any use), will the linker change the object code output by the assembler? If so, what does it change?

I heard that the linker also does some kind of memory mapping. I don't understand how. The program is not running; it's still being built. How could the linker map anything to memory? What would that look like? What are all the functions of the linker?

When people refer to "relocation" or "address binding", I don't really get what they mean. What is it, and what is its purpose?

Some debuggers show info like "call stack: 0xfffef32, 0xf3234fe", etc. Those are run-time addresses, right? Or are they the memory addresses from the linker's so-called "memory mapping"?

When people refer to something like symbols or a symbol table, do they mean identifiers (variable names, constant names, function names)?

I searched for info on the internet but couldn't find anything useful. Maybe I'm not sure what to search for. I don't want to read big books on this, but if there are any articles or tutorials that make the concepts clear, those would also be helpful.

I'm a novice programmer, so it would be great if you could explain in simple but technical terms.

Solution

When you compile a source file, its contents are usually divided up by the compiler/assembler into several sections. As a hypothetical example, imagine that the following sections are used:

  • .text - contains all the executable code
  • .const - contains constant data
  • .data - contains read/write initialized data
  • .bss - contains read/write uninitialized data

For a single source file, the compiler/assembler allocates each item to the appropriate section and assigns each symbol an offset within its section, starting from zero.

For example:

int i;
const int j = 3;
int k = 4;
int l;
int main(void)
{
    return 1;
}

This could result in the following symbol table:

Symbol Section Offset
i      .bss    0
j      .const  0
k      .data   0
l      .bss    4
main   .text   0

In the object file, in addition to the symbol table, the contents of each section are kept. In this example, the .text section would contain the object code for "return 1", the .const section would contain 3, and the .data section would contain 4. The contents of the .bss section would not need to be stored in the object file, because its variables have no initializers; only the section's size has to be recorded.
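As a small illustration of why no bytes need to be stored for uninitialized variables: C guarantees that objects with static storage duration and no initializer start out as zero, so only their size matters. The sketch below reuses the variables from the example; the section named in each comment is the hypothetical placement from the table above, not something the C standard promises, and check_startup_values is a name made up for this sketch.

```c
#include <assert.h>

int i;              /* no initializer: hypothetically placed in .bss   */
const int j = 3;    /* constant data: hypothetically placed in .const  */
int k = 4;          /* initialized data: hypothetically placed in .data */
int l;              /* no initializer: hypothetically placed in .bss   */

/* Checks the values the variables are guaranteed to have at program
   start, before any code has written to them. */
void check_startup_values(void)
{
    assert(i == 0);   /* zero without any stored bytes in the file */
    assert(l == 0);
    assert(j == 3);   /* these two values must be stored in the file */
    assert(k == 4);
}
```

Calling check_startup_values() first thing in main would pass on any conforming implementation; which section each variable really ends up in depends on the toolchain.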

The first thing a linker might do is concatenate the corresponding sections of all the input object files (all the .text sections together, all the .data sections together, and so on) and adjust the symbol offsets accordingly.
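The concatenation step can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the struct layouts, the enum, and the function name merge_objects are all invented here and do not correspond to any real object file format.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum section { SEC_TEXT, SEC_CONST, SEC_DATA, SEC_BSS, SEC_COUNT };

/* Hypothetical symbol table entry: name, section, offset in that section. */
struct symbol {
    const char *name;
    enum section section;
    unsigned offset;
};

/* Hypothetical view of one object file: section sizes plus its symbols. */
struct object_file {
    unsigned section_size[SEC_COUNT];
    struct symbol *symbols;
    size_t symbol_count;
};

/* Lays the input files' sections end to end, one run per section kind,
   and rebases every symbol by the combined size of the same section in
   the files that came before it. total_size receives the size of each
   merged section. */
void merge_objects(struct object_file *files, size_t file_count,
                   unsigned total_size[SEC_COUNT])
{
    assert(files != NULL || file_count == 0);
    memset(total_size, 0, SEC_COUNT * sizeof total_size[0]);

    for (size_t f = 0; f < file_count; f++) {
        /* Symbols of this file now sit after the earlier files' data. */
        for (size_t s = 0; s < files[f].symbol_count; s++) {
            struct symbol *sym = &files[f].symbols[s];
            sym->offset += total_size[sym->section];
        }
        /* Then this file's sections extend the merged sections. */
        for (int sec = 0; sec < SEC_COUNT; sec++)
            total_size[sec] += files[f].section_size[sec];
    }
}
```

For example, if the first file has 8 bytes of .bss and a second file defines a symbol at .bss offset 0, that symbol's offset becomes 8 in the merged .bss section, while the first file's symbols keep their offsets.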

Now we get to what is called "relocation" or "address binding". Let's say that in a hypothetical system, executable code starts at address 0x1000. Let's also say that the data sections of a program want to start at an even page boundary after the executable code. The linker would assign 0x1000 as the base of the concatenated .text sections and adjust all the symbols in them. It would then assign the bases of the .const, .data, and .bss sections similarly, to place them at appropriate places in memory.
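The address assignment described here comes down to a little arithmetic. The sketch below assumes a 0x1000-byte page and assumes that .const, .data, and .bss are packed one after another starting at the first page boundary after .text; both are choices made for this illustration, as are all the names, and a real linker decides the layout from its own rules or a linker script.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 0x1000u   /* assumed page size for this sketch */

/* Rounds addr up to the next multiple of align (a power of two). */
unsigned align_up(unsigned addr, unsigned align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Base address chosen for each merged section. */
struct layout {
    unsigned text_base;
    unsigned const_base;
    unsigned data_base;
    unsigned bss_base;
};

/* Places .text at text_start, then the data sections back to back,
   beginning at the first page boundary after the end of .text. */
struct layout assign_bases(unsigned text_start, unsigned text_size,
                           unsigned const_size, unsigned data_size)
{
    struct layout out;
    out.text_base  = text_start;
    out.const_base = align_up(text_start + text_size, SKETCH_PAGE_SIZE);
    out.data_base  = out.const_base + const_size;
    out.bss_base   = out.data_base + data_size;
    return out;
}

/* A symbol's final address is its section's base plus its offset. */
unsigned absolute_address(unsigned section_base, unsigned offset)
{
    return section_base + offset;
}
```

With .text at 0x1000 and 0x234 bytes long, and 4 bytes each of .const and .data, this puts .const at 0x2000, .data at 0x2004, and .bss at 0x2008, so the variable l (offset 4 in .bss) ends up at 0x200c.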

Sometimes there are symbolic references in a section. These references have to be updated by the linker to reflect the final position of the symbol referred to. The object file could contain "relocation records" that look like:

Section Offset Symbol
.text   0x1234 foo

For each relocation record, the linker goes to the given offset in the given section and updates the value stored there to reflect the symbol's final address.
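Patching one such reference might look like the sketch below. It assumes the simplest possible case: the slot holds a 32-bit absolute address written in the host's byte order. Real formats define several relocation types (PC-relative ones, ones with addends, and so on), and the struct and function names here are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical relocation record: "at this offset in the section,
   store the final address of this symbol". */
struct relocation {
    size_t offset;
    const char *symbol;
};

/* A symbol after address assignment: its name and final address. */
struct resolved_symbol {
    const char *name;
    uint32_t address;
};

/* Writes the final address of rel->symbol into the section bytes at
   rel->offset. Returns 0 on success, or -1 if the symbol is unknown
   or the 4-byte slot would fall outside the section. */
int apply_relocation(unsigned char *section, size_t section_size,
                     const struct relocation *rel,
                     const struct resolved_symbol *symbols,
                     size_t symbol_count)
{
    assert(section != NULL && rel != NULL);

    if (rel->offset > section_size ||
        section_size - rel->offset < sizeof(uint32_t))
        return -1;

    for (size_t i = 0; i < symbol_count; i++) {
        if (strcmp(symbols[i].name, rel->symbol) == 0) {
            uint32_t addr = symbols[i].address;
            /* memcpy avoids alignment problems at arbitrary offsets. */
            memcpy(section + rel->offset, &addr, sizeof addr);
            return 0;
        }
    }
    return -1;   /* an unresolved symbol would be a link error */
}
```

A linker would run something like this once per relocation record, after every symbol has received its final address.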

After all this is done, the resulting "absolute" object file can be loaded into memory (at the proper spot, of course!) and executed.
