Why use RIP-relative addressing in NASM?
Question
I have an assembly hello world program for Mac OS X that looks like this:
global _main
section .text
_main:
    mov rax, 0x2000004
    mov rdi, 1
    lea rsi, [rel msg]
    mov rdx, msg.len
    syscall
    mov rax, 0x2000001
    mov rdi, 0
    syscall
section .data
msg: db "Hello, World!", 10
.len: equ $ - msg
I was wondering about the line lea rsi, [rel msg]. Why does NASM force me to do that? As I understand it, msg is just a pointer to some data in the executable, and doing mov rsi, msg would put that address into rsi. But if I replace the line lea rsi, [rel msg] with mov rsi, msg, NASM throws this error (note: I am using the command nasm -f macho64 hello.asm):
hello.asm:9: fatal: No section for index 2 offset 0 found
Why does this happen? What is so special about lea that mov can't do? How would I know when to use each one?
Recommended answer
What is so special about lea that mov can't do?
mov reg,imm loads an immediate constant into its destination operand. The immediate is encoded directly in the instruction bytes: for example, mov eax,someVar would be encoded as B8 EF CD AB 00 if the address of someVar is 0x00ABCDEF. That is, to encode such an instruction with imm being the address of msg, the exact address of msg must be known. In position-independent code you don't know it a priori.
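The encoding described above can be reproduced with a small flat-binary sketch. The address 0x00ABCDEF comes from the text; the use of org and of the bin output format to pin someVar to that address is an assumption made purely for illustration (a real Mach-O object would not have a fixed address at assembly time, which is exactly the problem).

```nasm
; Assumed build command: nasm -f bin imm.asm -o imm.bin
; The flat binary can then be inspected with a hex dump or ndisasm -b 64.
bits 64
org 0x00ABCDEF              ; pretend the image is loaded at this address

someVar: db 0               ; one data byte, located at 0x00ABCDEF
    mov eax, someVar        ; B8 EF CD AB 00: opcode B8, then the
                            ; 32-bit address in little-endian order
```

The address bytes are baked into the instruction, so the code is only correct if it really ends up at the assumed location.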
mov reg,[expression] loads the value located at the address described by expression. The encoding scheme of x86 instructions allows quite complex expressions: in general it's reg1+reg2*s+displ, where s can be 1, 2, 4, or 8, reg1 and reg2 are general-purpose registers (either may be omitted), and displ is a constant displacement. In 64-bit mode expression can take one more form: RIP+displ, i.e. the address is calculated relative to the next instruction.
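As a rough illustration of these addressing forms, the following sketch lists one load per form in NASM syntax. The label value and the choice of registers are hypothetical and only serve to give each expression something to refer to.

```nasm
bits 64

section .text
    mov rax, [rbx]                  ; reg1 only
    mov rax, [rbx + rcx*8]          ; reg1 + reg2*s
    mov rax, [rbx + rcx*4 + 16]     ; reg1 + reg2*s + displ
    mov rax, [rcx*4 + 16]           ; reg2*s + displ, no base register
    mov rax, [rel value]            ; RIP + displ (64-bit mode only)

section .data
value: dq 42                        ; hypothetical data the last load reads
```

Each of these reads eight bytes from the computed address; none of them puts the address itself into rax.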
lea reg,[expression] uses this same address calculation to load the address itself into reg (unlike mov, which dereferences the calculated address). Thus information that is unavailable at assembly time, namely the absolute address that will be in RIP, can be used by the instruction without its value being known. The NASM expression lea rsi,[rel msg] gets translated into something like
lea rsi,[rip+(msg-nextInsn)]
nextInsn:
which uses the relative address msg-nextInsn instead of the absolute address of msg, so the assembler can encode the instruction without knowing the actual address.
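On the question of when to use each instruction, one possible way to write the program is sketched below. It uses NASM's default rel directive so that every [label] operand is treated as [rel label]; the extra movzx line is not needed by the program and is included only to contrast loading a value with loading an address.

```nasm
; Assemble as in the question: nasm -f macho64 hello.asm
default rel                     ; [label] now means [rel label]

global _main

section .text
_main:
    lea     rsi, [msg]          ; rsi = address of msg, computed from RIP
    movzx   eax, byte [msg]     ; eax = first byte stored at msg, not its address
                                ; (illustration only; rax is overwritten next)
    mov     rax, 0x2000004      ; write
    mov     rdi, 1              ; file descriptor 1 (stdout)
    mov     rdx, msg.len        ; assemble-time constant, a plain immediate is fine
    syscall
    mov     rax, 0x2000001      ; exit
    mov     rdi, 0
    syscall

section .data
msg:    db "Hello, World!", 10
.len:   equ $ - msg
```

Roughly: use lea with a RIP-relative operand when the address of a label is wanted, mov with a memory operand when the data stored there is wanted, and mov with an immediate for constants such as msg.len that do not depend on where the code is loaded.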