Assembly - x86 call instruction and memory address?

Question

I've been reading some assembly code and I've started seeing that call instructions are actually program counter relative.

However, whenever I'm using Visual Studio or WinDbg to debug, it always says call 0xFFFFFF ... which to me means it's saying I'm going to jump to that address.

Who is right? Is Visual Studio hiding the complexity of the instruction encoding and just saying "oh, that's what the program means"? That is, the debugger knows it's a pc-relative instruction, and since it knows the pc, it just goes and does the math for you?

Highly confused.

Answer

If you're disassembling .o object files that haven't been linked yet, the call address will just be a placeholder to be filled in by the linker.

You can use objdump -drwC -Mintel to show the relocation types + symbol names from a .o. (The -r option is the key. Or -R for an already-linked shared library.)


It's more useful to the user to show the actual address of the jump target, rather than disassemble it as jcc eip-1234H or something. A linked executable has a default load address, so the disassembler has a value for eip at every instruction, and this is usually shown in the address column of the disassembly output.

e.g. in some asm code I wrote (where I use symbol names that made it into the object file, so the loop branch target is actually visible to the disassembler):

objdump -M intel  -d rs-asmbench:
...
00000000004020a0 <.loop>:
  4020a0:       0f b6 c2                movzx  eax,dl
  4020a3:       0f b6 de                movzx  ebx,dh
   ...
  402166:       49 83 c3 10             add    r11,0x10
  40216a:       0f 85 30 ff ff ff       jne    4020a0 <.loop>

0000000000402170 <.last8>:
  402170:       0f b6 c2                movzx  eax,dl

Note that the jne instruction is encoded with a signed little-endian 32-bit displacement of -0xD0 bytes. (Jumps add their displacement to the value of e/rip after the jump instruction, i.e. the address of the next instruction. The jump instruction itself is 6 bytes long, so the displacement has to be -0xD0, not just -0xCA.) 0x100 - 0xD0 = 0x30, which is the value of the least-significant byte of the 2's complement displacement.
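
To double-check that arithmetic, here is a small Python sketch (not from the original answer; the helper name rel32_target is made up for illustration). It decodes the last four bytes of the jne above as a signed little-endian value and adds it to the address of the next instruction, which is what the disassembler does before printing the target.

```python
import struct

def rel32_target(insn_addr: int, insn_bytes: bytes) -> int:
    """Return the branch target of an instruction that ends in a rel32 field."""
    # "<i" means little-endian signed 32-bit integer
    (disp,) = struct.unpack("<i", insn_bytes[-4:])
    # The displacement is relative to the END of the instruction
    next_ip = insn_addr + len(insn_bytes)
    return next_ip + disp

jne = bytes.fromhex("0f8530ffffff")        # the jne at 0x40216a
jne_disp = struct.unpack("<i", jne[-4:])[0]
jne_target = rel32_target(0x40216a, jne)

print(hex(jne_disp))     # -0xd0
print(hex(jne_target))   # 0x4020a0
```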

In your question, you're talking about the call addresses being 0xFFFF..., which makes little sense unless that's just a placeholder, or you thought the non-0xFF bytes in the displacement were part of the opcode.

Before linking, references to external symbols look like this:

objdump -M intel -d main.o
  ...
  a5:   31 f6                   xor    esi,esi
  a7:   e8 00 00 00 00          call   ac <main+0xac>
  ac:   4c 63 e0                movsxd r12,eax
  af:   ba 00 00 00 00          mov    edx,0x0
  b4:   48 89 de                mov    rsi,rbx
  b7:   44 89 f7                mov    edi,r14d
  ba:   e8 00 00 00 00          call   bf <main+0xbf>
  bf:   83 f8 ff                cmp    eax,0xffffffff
  c2:   75 cc                   jne    90 <main+0x90>
  ...

Notice how the call instructions have their relative displacement = 0. So before the linker has slotted in the actual relative value, they encode a call with a target of the instruction right after the call. (i.e. RIP = RIP+0). The call bf is immediately followed by an instruction that starts at 0xbf from the start of the section. The other call has a different target address because it's at a different place in the file. (gcc puts main in its own section: .text.startup).
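
As a rough illustration of what the linker does with such a placeholder, here is a Python sketch. It assumes an x86-64 ELF PC-relative relocation with an addend of -4 (what objdump -r usually reports next to a call), for which the field is set to S + A - P (symbol address + addend - address of the field). The section address 0x400000 and callee address 0x401000 are hypothetical values chosen for the example, and apply_pc32 is a made-up helper name.

```python
import struct

def apply_pc32(code: bytearray, field_offset: int, section_addr: int,
               symbol_addr: int, addend: int = -4) -> None:
    """Patch a 4-byte PC-relative field in place with S + A - P."""
    place = section_addr + field_offset      # P: address of the field itself
    value = symbol_addr + addend - place     # S + A - P
    struct.pack_into("<i", code, field_offset, value)

CALL_OFF = 0xba                               # offset of the second call in main.o
code = bytearray(CALL_OFF) + bytearray.fromhex("e800000000")
section_addr = 0x400000                       # hypothetical final section address
callee = 0x401000                             # hypothetical address of the callee

apply_pc32(code, CALL_OFF + 1, section_addr, callee)   # field starts after the e8 opcode

patched = code[CALL_OFF:CALL_OFF + 5].hex()
disp = struct.unpack_from("<i", code, CALL_OFF + 1)[0]
resolved = section_addr + CALL_OFF + 5 + disp           # next instruction + displacement

print(patched)         # e8410f0000
print(hex(resolved))   # 0x401000
```

The -4 addend is there because the relocation is applied at the address of the displacement field, while the CPU adds the displacement to the address just past that 4-byte field.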

So, if you want to make sense of what's actually being called, look at a linked executable, or get a disassembler that looks at the object file's symbols and relocations to slot in symbolic names for call targets instead of showing them as calls with zero displacement.

Relative jumps to local symbols already get resolved before linking:

objdump -Mintel  -d asm-pinsrw.o:
0000000000000040 <.loop>:
  40:   0f b6 c2                movzx  eax,dl
  43:   0f b6 de                movzx  ebx,dh
  ...
 106:   49 83 c3 10             add    r11,0x10
 10a:   0f 85 30 ff ff ff       jne    40 <.loop>
0000000000000110 <.last8>:
 110:   0f b6 c2                movzx  eax,dl

Note that the relative jump to a symbol in the same file has exactly the same instruction encoding as in the linked executable. The object file has no base address, so the disassembler just treats it as zero.
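
A quick way to convince yourself of this is a Python sketch (illustrative only; encode_jne_rel32 is a made-up helper that handles just the 6-byte 0f 85 form). It encodes the jump once with the offsets from the .o and once with the addresses from the linked executable. Only the distance between the next instruction and the target matters, so the bytes come out identical.

```python
import struct

def encode_jne_rel32(insn_addr: int, target: int) -> bytes:
    """Encode `jne target` in its 6-byte near form: 0f 85 followed by rel32."""
    next_ip = insn_addr + 6                   # address of the following instruction
    return b"\x0f\x85" + struct.pack("<i", target - next_ip)

in_object = encode_jne_rel32(0x10a, 0x40)             # offsets in asm-pinsrw.o
in_binary = encode_jne_rel32(0x40216a, 0x4020a0)      # addresses in rs-asmbench

print(in_object.hex())   # 0f8530ffffff
print(in_binary.hex())   # 0f8530ffffff
```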

See Intel's reference manual for instruction encoding; there are links at https://stackoverflow.com/tags/x86/info. Even in 64-bit mode, a direct call only supports 32-bit sign-extended relative offsets. 64-bit absolute addresses are only supported by indirect calls, through a register or memory operand. (In 32-bit mode, 16-bit relative addresses are supported, with an operand-size prefix, I guess saving one instruction byte.)
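
To make the rel32 limit concrete, here is a small Python sketch (illustrative; fits_rel32 is a made-up name and the addresses are hypothetical) that checks whether a target is reachable with a 5-byte call rel32, i.e. whether the displacement fits in a signed 32-bit value.

```python
def fits_rel32(call_addr: int, target: int, insn_len: int = 5) -> bool:
    """True if `target` is reachable from a call rel32 located at `call_addr`."""
    disp = target - (call_addr + insn_len)    # relative to the next instruction
    return -2**31 <= disp <= 2**31 - 1

near_ok = fits_rel32(0x401000, 0x402000)           # same small executable
far_ok = fits_rel32(0x401000, 0x7fff00000000)      # target far more than 2 GiB away

print(near_ok)   # True
print(far_ok)    # False
```

When the check fails, the call has to go through a register or a memory location holding the full 64-bit address.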
