Assembly - x86 call instruction and memory address?
Question
I've been reading some assembly code, and I've started to see that call instructions are actually program-counter relative.
However, whenever I debug with Visual Studio or WinDbg, the disassembly always says call 0xFFFFFF..., which to me means it's saying I'm going to jump to that address.
Who is right? Is Visual Studio hiding the complexity of the instruction encoding and just showing what the program means? That is, does the debugger know it's a PC-relative instruction and, since it knows the PC, just do the math for you?
Highly confused.
Answer
If you're disassembling .o object files that haven't been linked yet, the call address will just be a placeholder to be filled in by the linker.
You can use objdump -drwC -Mintel to show the relocation types + symbol names from a .o file. (The -r option is the key. Or use -R for an already-linked shared library.)
It's more useful to the user to show the actual address of the jump target than to disassemble it as jcc eip-1234H or something. Linked executables normally have a default load address, so the disassembler has a value for eip at every instruction, and that address is usually shown in the disassembly output.
For example, in some asm code I wrote (where I used symbol names that made it into the object file, so the loop branch target is actually visible to the disassembler):
objdump -M intel -d rs-asmbench:
...
00000000004020a0 <.loop>:
  4020a0:   0f b6 c2                movzx  eax,dl
  4020a3:   0f b6 de                movzx  ebx,dh
...
  402166:   49 83 c3 10             add    r11,0x10
  40216a:   0f 85 30 ff ff ff       jne    4020a0 <.loop>

0000000000402170 <.last8>:
  402170:   0f b6 c2                movzx  eax,dl
Note that the encoding of the jne instruction ends with a signed little-endian 32-bit displacement of -0xD0 bytes. (Jumps add their displacement to the value of e/rip after the jump, that is, the address of the next instruction. The jump instruction itself is 6 bytes long, so the displacement has to be -0xD0, not just -0xCA.) 0x100 - 0xD0 = 0x30, which is the value of the least-significant byte of the 2's complement displacement (the 30 in 30 ff ff ff).
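The arithmetic above can be checked with a few lines of Python. This is only a sketch: the helper name branch_target is made up for illustration, and it assumes the instruction ends in a 4-byte rel32 field, as the jne in the listing does.

```python
import struct

def branch_target(insn_addr: int, insn_bytes: bytes) -> int:
    """Absolute target of a branch whose last 4 bytes are a rel32 field."""
    # The displacement is a signed 32-bit little-endian integer.
    (disp,) = struct.unpack("<i", insn_bytes[-4:])
    # It is added to the address of the *next* instruction.
    next_insn = insn_addr + len(insn_bytes)
    return next_insn + disp

jne = bytes.fromhex("0f 85 30 ff ff ff")   # bytes from the listing at 0x40216a
(disp,) = struct.unpack("<i", jne[-4:])

print(hex(disp))                           # -0xd0
print(hex(branch_target(0x40216a, jne)))   # 0x4020a0
```

This is the same calculation a disassembler or debugger does before it prints an absolute address next to a relative branch.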
In your question, you're talking about the call addresses being 0xFFFF..., which makes little sense unless that's just a placeholder, or you thought the non-0xFF bytes in the displacement were part of the opcode.
Before linking, references to external symbols look like this:
objdump -M intel -d main.o
...
  a5:   31 f6                   xor    esi,esi
  a7:   e8 00 00 00 00          call   ac <main+0xac>
  ac:   4c 63 e0                movsxd r12,eax
  af:   ba 00 00 00 00          mov    edx,0x0
  b4:   48 89 de                mov    rsi,rbx
  b7:   44 89 f7                mov    edi,r14d
  ba:   e8 00 00 00 00          call   bf <main+0xbf>
  bf:   83 f8 ff                cmp    eax,0xffffffff
  c2:   75 cc                   jne    90 <main+0x90>
...
Notice how the call instructions have a relative displacement of 0. So until the linker has slotted in the actual relative value, each one encodes a call whose target is the instruction right after the call (i.e. RIP = RIP+0). The call bf is immediately followed by an instruction that starts at offset 0xbf from the start of the section. The other call shows a different target address because it's at a different place in the file. (gcc puts main in its own section: .text.startup.)
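As a rough illustration of what the linker later does to one of those zeroed fields, here is a Python sketch. It assumes the usual x86-64 ELF PC-relative relocation rule (value = S + A - P, with an addend of -4 for a call), treats the section as loaded at address 0, and uses a made-up symbol address of 0x500. Only the call at offset 0xa7 comes from the listing; the function name apply_pc32 is hypothetical.

```python
import struct

def apply_pc32(section: bytearray, field_offset: int, symbol_addr: int,
               addend: int = -4, section_base: int = 0) -> None:
    """Patch a 4-byte PC-relative field in place: value = S + A - P."""
    place = section_base + field_offset      # P: address of the field itself
    value = symbol_addr + addend - place     # S + A - P
    struct.pack_into("<i", section, field_offset, value)

# Zero padding up to 0xa7, then the unlinked call from the listing.
code = bytearray(0xa7) + bytearray.fromhex("e8 00 00 00 00")

# The rel32 field starts one byte after the e8 opcode, at 0xa8.
apply_pc32(code, field_offset=0xa8, symbol_addr=0x500)  # 0x500 is hypothetical

(disp,) = struct.unpack_from("<i", code, 0xa8)
print(code[0xa7:0xac].hex())   # e854040000
print(hex(0xac + disp))        # 0x500
```

The addend of -4 accounts for the CPU adding the displacement to the address of the next instruction, which is 4 bytes past the start of the field.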
So, if you want to make sense of what's actually being called, look at a linked executable, or use a disassembler that looks at the object file's symbols to slot in symbolic names for call targets instead of showing them as calls with zero displacement.
Relative jumps to local symbols already get resolved before linking:
objdump -Mintel -d asm-pinsrw.o:
0000000000000040 <.loop>:
  40:   0f b6 c2                movzx  eax,dl
  43:   0f b6 de                movzx  ebx,dh
...
 106:   49 83 c3 10             add    r11,0x10
 10a:   0f 85 30 ff ff ff       jne    40 <.loop>

0000000000000110 <.last8>:
 110:   0f b6 c2                movzx  eax,dl
Note that the relative jump to a symbol in the same file has exactly the same instruction encoding as in the linked executable. The .o file has no base address, so the disassembler just treats the base as zero.
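A short Python sketch makes the point concrete: the displacement depends only on the distance between the branch and its target, so the two listings produce identical bytes. The helper encode_jne_rel32 is a name invented for this example, and it only handles the 6-byte 0f 85 form shown above.

```python
import struct

def encode_jne_rel32(branch_addr: int, target_addr: int) -> bytes:
    """Encode the 6-byte 'jne rel32' form (opcode 0f 85 + signed LE rel32)."""
    insn_len = 6
    disp = target_addr - (branch_addr + insn_len)  # relative to next instruction
    return b"\x0f\x85" + struct.pack("<i", disp)

unlinked = encode_jne_rel32(0x10a, 0x40)            # addresses from the .o listing
linked = encode_jne_rel32(0x40216a, 0x4020a0)       # addresses from the executable

print(unlinked.hex())        # 0f8530ffffff
print(unlinked == linked)    # True
```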
See Intel's reference manual for instruction encoding; there are links at https://stackoverflow.com/tags/x86/info. Even in 64-bit mode, call only supports 32-bit sign-extended relative offsets. 64-bit addresses are supported only as absolute targets, through an indirect call via a register or memory operand. (In 32-bit mode, 16-bit relative addresses are supported with an operand-size prefix, I guess saving one instruction byte.)
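To get a feel for the 32-bit limit, the following Python sketch checks whether a direct call can reach a given target. The function name fits_rel32 and both example addresses are assumptions chosen for illustration; the only fact taken from the text is that the displacement is a sign-extended 32-bit value relative to the end of the 5-byte call.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def fits_rel32(call_addr: int, target_addr: int, insn_len: int = 5) -> bool:
    """True if 'call rel32' at call_addr can reach target_addr directly."""
    disp = target_addr - (call_addr + insn_len)
    return INT32_MIN <= disp <= INT32_MAX

# Hypothetical addresses: a nearby function and one far outside +/-2 GiB.
print(fits_rel32(0x401000, 0x402000))          # True
print(fits_rel32(0x401000, 0x7fff00000000))    # False
```

When the check fails, the target has to be reached indirectly, for example by loading the 64-bit address into a register and calling through it.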