Function Call: Labels into memory addresses

Question

I have difficulty understanding the correct sequence of events. When a program written in an abstract language is compiled, it is translated into machine code. Subsequently, only when the program is run, it is loaded into RAM, in the code segment. At this point, each instruction in the program will be at a specific memory address. When a function is called in assembly, the Call statement is typically followed by a label. I assume this label will be replaced with the function's memory address by the compiler. And this is where I absolutely can't understand. If the instructions are loaded into memory only when the program is running, so that each instruction obtains its own memory address only then, how does the compiler know the memory address to which the label corresponds? If the function is not yet in memory, how can the program, compiled into binary code where the labels are no longer available, know the memory address, corresponding to that label, where the function will be loaded at the moment of execution? I am a bit confused. Help me.

Recommended answer

A program contains several "sections" (some are optional):

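  • a section that holds the code, usually called the "text" section
  • a section that holds the initial values of mutable global data
  • a section that holds immutable constants, often called rodata
  • a section that holds a set of relocation records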

A section is stored as a contiguous chunk or block of bytes in the program file on disk.

The loader creates memory chunks and loads the code, data, and rodata into them. A stack is also created; depending on the OS, this is done either by the loader or possibly by the fork of the parent process that creates the child process.

Knowing the final addresses, the loader also processes the relocation records.  These relocations describe where in the text and data sections updates are needed so that references reflect the final addresses of the sections as loaded into memory.

The relocation mechanism is general purpose, as: code can refer to code, code can refer to data, data can refer to code, and data can refer to data.

A single relocation record describes a reference that needs to be updated.  Each record describes:

  1. a referring source — at what offset in the text or data section to make an address update
  2. a referring target — which section is being referred to: code or data
  3. what kind of update to make (some architectures have complex instruction encodings)

Some updates are for ordinary pointers, while others are for instructions.  On instruction set architectures that have complex instruction offset/immediate encodings, like MIPS, RISC-V, and HP-PA, the relocation record also needs to specify the immediate encoding method.

Usually the referring location already holds an offset, so the update is a matter of adding the base address of the section being referred to, to the offset already in place at the referring source.
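The update rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Relocation` record layout, the single `abs32` kind (a plain 32-bit little-endian pointer), and the base addresses are hypothetical and do not correspond to any real object file format.

```python
import struct
from dataclasses import dataclass


@dataclass
class Relocation:
    source_section: str  # section containing the reference to patch
    source_offset: int   # byte offset of the reference within that section
    target_section: str  # section being referred to
    kind: str            # only "abs32" is modeled in this sketch


def apply_relocations(sections, bases, relocations):
    """Patch each reference in place: stored offset + base of the target section."""
    for r in relocations:
        if r.kind != "abs32":
            raise ValueError(f"unsupported relocation kind: {r.kind}")
        image = sections[r.source_section]
        (stored_offset,) = struct.unpack_from("<I", image, r.source_offset)
        final_address = (bases[r.target_section] + stored_offset) & 0xFFFFFFFF
        struct.pack_into("<I", image, r.source_offset, final_address)


# Hypothetical program: the data section holds a pointer to offset 0x10 in text.
sections = {
    "text": bytearray(0x20),
    "data": bytearray(struct.pack("<I", 0x10)),
}
relocations = [Relocation("data", 0, "text", "abs32")]

# Load addresses chosen by the imaginary loader.
bases = {"text": 0x00400000, "data": 0x00600000}
apply_relocations(sections, bases, relocations)

(patched,) = struct.unpack_from("<I", sections["data"], 0)
print(hex(patched))  # 0x400010
```

The file on disk only ever stores the offset `0x10`; the absolute address appears once the loader has picked a base for the text section.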

Other metadata in the program describes where to start, e.g. the initial program counter, which would be expressed as an offset into the text section.

Most processors today support (as fuz describes) position independent code (PIC).  This is typically done via pc-relative addressing.  The processor performs branches and calls within the single text section using pc-relative addressing modes, and thus, no relocation records are required for these instructions.
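To see why a pc-relative call needs no relocation, here is a small Python sketch. It models the encoding on the x86 `call rel32` instruction (opcode `E8` followed by a signed 32-bit displacement measured from the next instruction); the helper names, offsets, and base addresses are made up for illustration.

```python
import struct

CALL_SIZE = 5  # one opcode byte plus a 32-bit displacement


def encode_call_rel32(call_offset, target_offset):
    """Encode a call whose operand is the target minus the next instruction's address."""
    displacement = target_offset - (call_offset + CALL_SIZE)
    return b"\xE8" + struct.pack("<i", displacement)


def resolve_call(base, call_offset, instruction):
    """What the processor computes at run time: next-instruction address + displacement."""
    (displacement,) = struct.unpack_from("<i", instruction, 1)
    return base + call_offset + CALL_SIZE + displacement


# A call at text offset 0x100 to a function at text offset 0x40.
instruction = encode_call_rel32(call_offset=0x100, target_offset=0x40)

# The same bytes reach the right function wherever the text section is loaded.
for base in (0x00400000, 0x7F0000001000):
    assert resolve_call(base, 0x100, instruction) == base + 0x40
```

Because both caller and callee move together with the section, the displacement fixed at build time stays correct for every load address.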

Dynamically loaded libraries add complexity, since each DLL, as well as the main program being run, has the format of a program, i.e. each has its own sections, including its own text section.  The relocations will also be capable of describing references to symbol imports, supported by additional sections holding symbol names, imports, and exports.
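As a rough illustration of how imports might be matched against exports once every module has a load address, consider the sketch below. The `Module` structure, the symbol name `puts_like`, and the addresses are all hypothetical; real dynamic linkers use considerably more elaborate tables and lookup rules.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    name: str
    text_base: int                               # where this module's text section was loaded
    exports: dict = field(default_factory=dict)  # symbol name -> offset in this module's text
    imports: list = field(default_factory=list)  # symbol names this module needs


def bind_imports(modules):
    """Return {(module name, symbol): absolute address} for every import."""
    exported = {}
    for m in modules:
        for symbol, offset in m.exports.items():
            exported[symbol] = m.text_base + offset
    bound = {}
    for m in modules:
        for symbol in m.imports:
            if symbol not in exported:
                raise LookupError(f"{m.name}: unresolved import {symbol!r}")
            bound[(m.name, symbol)] = exported[symbol]
    return bound


main = Module("main", 0x00400000, exports={"main": 0x0}, imports=["puts_like"])
lib = Module("libdemo", 0x70000000, exports={"puts_like": 0x120})

bound = bind_imports([main, lib])
print(hex(bound[("main", "puts_like")]))  # 0x70000120
```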

Object files (compiler output, pre-linking) typically follow this format as well.  A single object file has these sections, along with relocation records, symbol names, imports, and exports.  The linker's job is to merge object files into a single program or larger object file.  During the merge the linker resolves some relocations, but it cannot necessarily resolve all of them, so some may remain for the OS loader to resolve.

Let's imagine that, on a system using PIC, there is a reference: a call (code-to-code), from one object file to another, and that the linker merges these object files.  There will be a relocation record in the caller that refers to an imported symbol name (and in the other object file, an export of a symbol defined as some offset within its text section).  Once the two object files' sections are merged (e.g. by simply concatenating them into one larger text section), the call is now an intra-section reference, and the linker can compute the delta between the addresses of the caller and callee; this delta will not be changed by later linking or loading.  The linker will adjust the offset/immediate in the call instruction with that delta, and, knowing this reference is now resolved, omits this relocation record in the merge.
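The merge described above can be sketched as follows. The sketch again borrows the x86 `call rel32` encoding for concreteness; the `link` function, the byte contents of the two text sections, and the offsets are assumptions made for this example, not the behavior of any particular linker.

```python
import struct

CALL_SIZE = 5  # opcode byte plus 32-bit displacement, as in x86 `call rel32`


def link(caller_text, call_offset, callee_text, callee_symbol_offset):
    """Concatenate two text sections and resolve the caller's call to the callee's symbol."""
    merged = bytearray(caller_text) + bytearray(callee_text)
    # The callee's text now starts right after the caller's text.
    callee_address = len(caller_text) + callee_symbol_offset
    delta = callee_address - (call_offset + CALL_SIZE)
    # Patch the displacement field, which follows the one-byte opcode.
    struct.pack_into("<i", merged, call_offset + 1, delta)
    return merged, delta


# Hypothetical object files: the caller has a call at offset 3 with a zero
# placeholder; the callee exports a function at offset 2 of its own text.
caller_text = b"\x90\x90\x90" + b"\xE8\x00\x00\x00\x00" + b"\xC3"
callee_text = b"\x90\x90" + b"\xC3"

merged, delta = link(caller_text, 3, callee_text, 2)
print(delta, merged.hex())
```

After the patch, the call instruction carries a displacement that is valid at any load address, which is why the linker can drop the relocation record.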

For reference, see:

  • ELF (Executable and Linkable Format)
  • COFF (Common Object File Format)
  • Windows Portable Executable Format
  • Relocation
  • Object file
  • Executable
  • Position Independent Code
  • Addressing Mode
