What is the purpose of the assembler and symbol table? What is at a symbol's address?

Question

From my textbook:

To produce the binary version of each instruction in the assembly language program, the assembler must determine the addresses corresponding to all labels. Assemblers keep track of labels used in branches and data transfer instructions in a symbol table. As you might expect, the table contains pairs of symbols and addresses.

Why does it need a symbol table? If we have a symbol table with a label name and an address, what is the use of the address? What is at the address: just the name of the label, or the instructions that follow the label?

Say we have an instruction like this in MIPS assembly:

add_numbers:
   addi $s0, $t0, 2

Why wouldn't the symbol table just store add_numbers | <the_binary_representation_of_the_instruction> instead of add_numbers | <address_location_of_label>?

Recommended answer

A label IS an address. It is a way for programmers to give the assembler an address without having to know the physical address. Let the toolchain do that work for you.

I don't remember my MIPS offhand, so here is some pseudocode.

loop_top:
   nop
   nop
   sub r0,1
   cmp r0,0
   bne loop_top

It depends on the instruction set, but in general a conditional branch is pc-relative. The assembler builds tables during assembly and, in one or more passes over them, resolves the distance between the branch and its destination so that the branch can be fully encoded. For most instruction sets the code above can be resolved in a single pass. loop_top is a label that will have an address, but because the branch here is pc-relative, you don't need to know the physical address.
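The pass structure described above can be sketched in Python. This is only an illustration, not how any particular assembler is written. It assumes (none of this comes from the answer) that every instruction is 4 bytes, that the code starts at address 0, and that a branch offset is counted in instructions relative to the instruction after the branch, as on MIPS. The function names are hypothetical.

```python
# Sketch only: a toy two-pass "assembler" for the pseudocode loop above.
# Assumptions (not from the original answer): every instruction is 4 bytes,
# the code starts at address 0, and a branch offset is counted in
# instructions relative to the instruction after the branch (MIPS-like).

INSTRUCTION_SIZE = 4

def build_symbol_table(lines):
    """Pass 1: walk the source, recording the address of each label."""
    symbols = {}
    address = 0
    for line in lines:
        text = line.strip()
        if not text:
            continue
        if text.endswith(":"):
            # A label takes no space; it names the address of what follows.
            symbols[text[:-1]] = address
        else:
            address += INSTRUCTION_SIZE
    return symbols

def resolve_branches(lines, symbols):
    """Pass 2: replace each branch label with a pc-relative offset."""
    resolved = []
    address = 0
    for line in lines:
        text = line.strip()
        if not text or text.endswith(":"):
            continue
        mnemonic, _, operand = text.partition(" ")
        if mnemonic == "bne" and operand in symbols:
            next_pc = address + INSTRUCTION_SIZE
            offset = (symbols[operand] - next_pc) // INSTRUCTION_SIZE
            text = f"{mnemonic} {offset:+d}"
        resolved.append((address, text))
        address += INSTRUCTION_SIZE
    return resolved

source = [
    "loop_top:",
    "   nop",
    "   nop",
    "   sub r0,1",
    "   cmp r0,0",
    "   bne loop_top",
]

symbols = build_symbol_table(source)
program = resolve_branches(source, symbols)
```

The symbol table holds only the pair (loop_top, address). The instruction bytes live in the program itself, so the table has no reason to store them, and the branch ends up encoded as a distance rather than as an absolute address.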

But consider:

   call my_fun

After making a pass over the code, the assembler finds that my_fun is not defined in this file, and/or the assembly language has some syntax that marks it as external before it is used. Either way it is external and cannot be resolved when this file is assembled. So tables are required that record the label name and where in this object the instruction lives. Depending on the assembler, it may fill in the offset or full address as zero for now, or encode the instruction as an infinite loop. The linker later determines the actual addresses of things in the processor's memory space. While linking, it ultimately has a table of all labels relevant at this phase of the toolchain, together with their addresses. The linker then goes back into the code and repairs or creates the machine code for this call instruction, now that it knows the actual address of that label.

j hello

The object:

Disassembly of section .text:

00000000 <.text>:
   0:   08000000    j   0x0
   4:   00000000    nop

Another object:

.globl hello
hello:
    j hello

.word hello

Link them:

Disassembly of section .text:

00001000 <_ftext>:
    1000:   08000402    j   1008 <hello>
    1004:   00000000    nop

00001008 <hello>:
    1008:   08000402    j   1008 <hello>
    100c:   00000000    nop
    1010:   00001008    0x1008

As objects, all the toolchain has to go on is that the label hello is being used as an address to be resolved later. In this case that happens at link time: the linker works through the objects, counting bytes and building a table of labels and their addresses. During the first pass or some later pass, it changes the instructions or data as needed to resolve these labels.
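The patching step can be sketched as follows. The J-type encoding used here (opcode 2 in the top 6 bits, target address shifted right by 2 in the low 26 bits) is standard MIPS and matches the words in the disassembly above. The relocation list layout and the function names are hypothetical, chosen only for illustration; a real linker works from relocation records in the object file format.

```python
# Sketch only: how a linker might patch the MIPS `j hello` shown above.
# The J-type encoding (opcode 2 in the top 6 bits, target address >> 2 in the
# low 26 bits) is standard MIPS; the relocation record layout and the
# function names here are hypothetical, for illustration.

J_OPCODE = 0x08000000        # `j` with a zero target, as in the unlinked object
TARGET_MASK = 0x03FFFFFF     # low 26 bits hold the word-aligned target

def encode_jump(target_address):
    """Return the 32-bit machine word for `j target_address`."""
    return J_OPCODE | ((target_address >> 2) & TARGET_MASK)

def link(code, relocations, symbol_table):
    """Patch each instruction that refers to a symbol resolved at link time.

    code:          dict mapping address -> 32-bit instruction word
    relocations:   list of (address_of_instruction, symbol_name)
    symbol_table:  dict mapping symbol_name -> final address
    """
    patched = dict(code)
    for address, name in relocations:
        patched[address] = encode_jump(symbol_table[name])
    return patched

# Placeholder words as the assembler left them: `j 0x0` followed by a nop.
code = {0x1000: 0x08000000, 0x1004: 0x00000000,
        0x1008: 0x08000000, 0x100C: 0x00000000}
relocations = [(0x1000, "hello"), (0x1008, "hello")]
symbol_table = {"hello": 0x1008}

linked = link(code, relocations, symbol_table)
```

With hello placed at 0x1008, both jumps become 0x08000402, the same word the linked disassembly shows at 0x1000 and 0x1008.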

For old-school assemblers that did both the assembling and the linking from the same source file, the statement "the assembler must determine the addresses corresponding to all labels" holds. With commonly used toolchains, though, it is generally not the assembler that does the linker's work, so the quoted statement could use some improvement. But hopefully this demonstrates that labels are addresses: they represent a yet-to-be-determined address, so the code is easier to write than something like this:

  nop
  nop
  j pc-2

Then if you add another instruction:

  nop
  add r0,r1
  nop
  j pc-3

   j 0x1008

Then you have to spend a significant amount of time rewriting the program to get each and every address hardcoded into it. Add or remove a single line and a lot of other code has to change. Labels representing addresses make all of that significantly easier: the toolchain determines the addresses, then goes back and basically replaces the labels with addresses.

With one nop added:

Disassembly of section .text:

00001000 <_ftext>:
    1000:   08000403    j   100c <hello>
    1004:   00000000    nop
    1008:   00000000    nop

0000100c <hello>:
    100c:   08000403    j   100c <hello>
    1010:   00000000    nop
    1014:   0000100c

If we didn't have labels and had to hardcode the address instead, you would have to change those three places as a result of the nop. That is one line. What if you added dozens of lines, or hundreds? How would you keep track of it all? By putting labels in comments? You would assemble, disassemble, and patch up the source over and over again until it looked somewhat right, and hope for no bugs.
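To make the bookkeeping concrete, here is a small Python sketch that assigns addresses to the two versions of the program and lists every place that refers to hello. It is a toy model under stated assumptions: 4-byte items, a base address of 0x1000 as in the listings, and the nop after each jump written out explicitly because the listings show one there. The function names are hypothetical.

```python
# Sketch only: re-running address assignment after inserting one instruction.
# Assumptions (hypothetical, for illustration): 4-byte items, a base address
# of 0x1000 as in the listings above, and a tiny list-of-strings "program".

BASE = 0x1000
WORD = 4

def assign_addresses(items):
    """Return (symbol_table, placed) where placed is [(address, item), ...]."""
    symbols, placed = {}, []
    address = BASE
    for item in items:
        if item.endswith(":"):
            symbols[item[:-1]] = address   # label = address of the next item
        else:
            placed.append((address, item))
            address += WORD
    return symbols, placed

def references(placed, symbols):
    """List every placed item that mentions a label, with the resolved address."""
    found = []
    for address, item in placed:
        operand = item.split()[-1]
        if operand in symbols:
            found.append((address, item, symbols[operand]))
    return found

before = ["j hello", "nop", "hello:", "j hello", "nop", ".word hello"]
after = ["j hello", "nop", "nop", "hello:", "j hello", "nop", ".word hello"]

symbols_before, placed_before = assign_addresses(before)
symbols_after, placed_after = assign_addresses(after)

for address, item, target in references(placed_after, symbols_after):
    print(f"{address:#06x}: {item:<12} -> {target:#x}")
```

Inserting the nop moves hello from 0x1008 to 0x100c, and all three references follow automatically because they are recomputed from the table rather than edited by hand.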

mips-elf-readelf -s so.elf

Symbol table '.symtab' contains 14 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 00001000     0 SECTION LOCAL  DEFAULT    1 
     2: 00400000     0 SECTION LOCAL  DEFAULT    2 
     3: 00400018     0 SECTION LOCAL  DEFAULT    3 
     4: 00000000     0 SECTION LOCAL  DEFAULT    4 
     5: 0000a010     0 NOTYPE  LOCAL  DEFAULT    2 _gp
     6: 00002018     0 NOTYPE  GLOBAL DEFAULT    4 _fdata
     7: 0000100c     0 OBJECT  GLOBAL DEFAULT    1 hello
     8: 00001000     0 NOTYPE  GLOBAL DEFAULT    1 _ftext
     9: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND _start
    10: 00002018     0 NOTYPE  GLOBAL DEFAULT    2 __bss_start
    11: 00002018     0 NOTYPE  GLOBAL DEFAULT    2 _edata
    12: 00002018     0 NOTYPE  GLOBAL DEFAULT    2 _end
    13: 00002018     0 NOTYPE  GLOBAL DEFAULT    2 _fbss

This is the one of interest:

     7: 0000100c     0 OBJECT  GLOBAL DEFAULT    1 hello

The label hello, once assembled and linked into the final binary, is equal to address 0x100C.
