Difference between Base and Displacement


Problem description

I have some trouble understanding two instructions I encountered.
The first one is as follows:

imul   eax,DWORD PTR [esi+ebx*4-0x4]

Does this instruction mean: take the value at the address computed between the brackets, multiply it by eax, and store the result in that same register (eax)? If so, do we calculate the address between the brackets like this?

  1. ebx * 4
  2. esi + the result of step 1
  3. subtract 4 from the result
  4. go to the address (result) and get the value inside it.

The second instruction I have trouble decoding is this one:

jmp    DWORD PTR [eax*4+0x80497e8]

- Is eax*4 equivalent to index * scale?
- Is 0x80497e8 the displacement?

So to get the address inside the brackets, is this the order we should follow?

  1. eax * 4
  2. add the result of step 1 to the address 0x80497e8
  3. jump to that address

In my understanding, [base + index*scale] is used to fetch values inside an array. The base is the pointer to the first element of the array, the index is literally the index where the value we want is stored, and the scale is the size of the data the array contains.

My issue is: when you add a displacement to the equation, what is the displacement used for? And what does it mean when the displacement has a negative value?

Recommended answer

Don't be fooled by the terminology. "Base" has a specific technical meaning, and the "base" component of an addressing mode doesn't have to be the start of an array. For example, [esp + 16 + esi*4] could be indexing a local array that starts at esp+16, even though esp is the base register.

Similarly, the most obvious interpretation of [esi+ebx*4-0x4] is array[i-1], with i in ebx and esi holding the start address of the array. It's an obvious optimization for the compiler to fold the -1 into the addressing mode instead of computing ebx-1 in another register and using that as the index.
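To make the folding concrete, here is a minimal Python sketch of the 32-bit address arithmetic. The function name effective_address and the values 0x1000 and 3 are hypothetical, chosen only for illustration; the point is that [esi + ebx*4 - 4] and [esi + (ebx-1)*4] name the same byte.

```python
MASK32 = 0xFFFFFFFF  # 32-bit address arithmetic wraps around


def effective_address(base, index, scale, disp):
    """Model of base + index*scale + disp in 32-bit mode."""
    assert scale in (1, 2, 4, 8)  # the only scale factors x86 can encode
    return (base + index * scale + disp) & MASK32


# Hypothetical values: an array of 4-byte elements starting at 0x1000, i = 3.
array_start = 0x1000  # plays the role of esi
i = 3                 # plays the role of ebx

addr = effective_address(array_start, i, 4, -4)     # [esi + ebx*4 - 0x4]
same = effective_address(array_start, i - 1, 4, 0)  # [esi + (ebx-1)*4]
print(hex(addr), hex(same))  # 0x1008 0x1008
```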

"And what does it mean when the displacement has a negative value?"

It doesn't "mean" anything. The hardware just does binary addition and uses the result. It's up to the programmer (or compiler) to use an addressing mode that accesses the byte you want.
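As a small illustration of "just binary addition", the following Python sketch models a 32-bit adder. The helper add32 and the base value 0x2000 are made up for the example; it shows that adding the bit pattern of -4 gives the same result as subtracting 4.

```python
MASK32 = 0xFFFFFFFF


def add32(a, b):
    """32-bit binary addition: the carry out of bit 31 is simply dropped."""
    return (a + b) & MASK32


disp_bits = -4 & MASK32  # the 2's complement bit pattern of -4
base = 0x2000            # hypothetical register value

print(hex(disp_bits))                      # 0xfffffffc
print(hex(add32(base, disp_bits)))         # 0x1ffc
print(add32(base, disp_bits) == base - 4)  # True
```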

My answer on "Referencing the contents of a memory location. (x86 addressing modes)" has examples of when you might use every possible addressing mode for array indexing, with either a pointer to an array or a static array (so you can hard-code the array start address as an absolute displacement).

In technical x86 addressing mode terminology:

  • displacement: the constant (positive or negative) part of an address, encoded either as a 2's complement disp8 that gets sign-extended, or as a disp32. (In 64-bit addressing modes, the disp32 is sign-extended to 64 bits.)
  • offset: the result of the esi+ebx*4-0x4 calculation, i.e. the offset relative to the segment base. (In a normal flat memory model with segment base = 0, the offset is the whole address.)
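Sign extension of an encoded displacement can be sketched in Python as follows. The helper sign_extend is hypothetical, and the byte 0xFC is an assumption about how an assembler would typically encode the -0x4 from the question as a disp8.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a 2's complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


print(sign_extend(0xFC, 8))             # -4   (assumed disp8 encoding of -0x4)
print(sign_extend(0x7F, 8))             # 127  (largest positive disp8)
print(hex(sign_extend(0x80497E8, 32)))  # 0x80497e8 (a positive disp32)
```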

People often use "offset" to describe the displacement, and usually there's no confusion because it's clear from context that they're talking about a constant offset (using the English word "offset" in a sense other than the x86 seg:off one), but I like to stick to "displacement" to describe the displacement.

base: the non-index register component of the addressing mode, if there is one. (The encoding for "no base register" instead means there's a disp32, and you can think of that as a base. It implies the DS segment.)

This includes the case of having only an index and no base register: [esi*4] can only be encoded as [dword 0 + esi*4].
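For readers curious about why a disp32 is forced, here is a rough Python sketch of how the SIB byte for [esi*4] is put together. The names REG, SCALE_BITS, NO_BASE and sib_byte are invented for this example; the bit layout (scale in bits 7-6, index in bits 5-3, base in bits 2-0, with base field 101 meaning "no base, disp32 follows" when ModRM.mod is 00) follows the standard 32-bit encoding.

```python
# Register numbers used in the ModRM/SIB fields.
REG = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
       "esp": 4, "ebp": 5, "esi": 6, "edi": 7}
SCALE_BITS = {1: 0, 2: 1, 4: 2, 8: 3}
NO_BASE = 5  # base field 101 with ModRM.mod = 00: no base register, disp32 follows


def sib_byte(scale, index, base_field):
    """Pack scale (bits 7-6), index (bits 5-3) and base (bits 2-0)."""
    return (SCALE_BITS[scale] << 6) | (REG[index] << 3) | base_field


sib = sib_byte(4, "esi", NO_BASE)
disp32 = (0).to_bytes(4, "little")  # the mandatory 4-byte displacement, here 0
print(hex(sib), disp32.hex())  # 0xb5 00000000
```

So even with nothing to add, the instruction carries four zero bytes of displacement, which is why [esi*4] is the same encoding as [dword 0 + esi*4].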

imul   eax,DWORD PTR [esi+ebx*4-0x4]

Yes, eax *= the memory source operand.

And yes, your address calculation is correct: base + scaled index + signed displacement, resulting in a virtual address (see footnote 1).
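The whole instruction can be modelled in a few lines of Python. This is only a sketch under stated assumptions: the memory bytearray, the array address 0x10, the element values and the helper names load_dword and imul32 are all hypothetical stand-ins, not anything from the original program.

```python
import struct

MASK32 = 0xFFFFFFFF

# Hypothetical flat memory: four 32-bit integers placed at address 0x10.
memory = bytearray(64)
ARRAY = 0x10
struct.pack_into("<4i", memory, ARRAY, 10, 20, 30, 40)


def load_dword(addr):
    """Data load: read 4 little-endian bytes as a signed 32-bit integer."""
    return struct.unpack_from("<i", memory, addr)[0]


def imul32(dst, src):
    """Two-operand imul: keep the low 32 bits of the product."""
    return (dst * src) & MASK32


eax, esi, ebx = 7, ARRAY, 3
addr = (esi + ebx * 4 - 4) & MASK32  # base + scaled index + displacement
eax = imul32(eax, load_dword(addr))  # eax *= dword at that address
print(hex(addr), eax)  # 0x18 210
```

With ebx = 3 the load reads element 2 of the array (the value 30), matching the array[i-1] reading of the addressing mode.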

"go to the address (result) and get the value inside it" is a weird way to describe it. "Go to" would normally mean a control transfer, fetching the bytes as code. But that's not what happens: this is just a data load from that address, fully handled by hardware.

A modern x86 CPU (Intel Skylake, for example) decodes imul eax, [esi+ebx*4 - 4] into two uops: a load and an imul ALU operation. The ALU operation has to wait for the load result. (Fun fact: the two micro-ops are actually micro-fused into a single uop for most of the pipeline, except in the out-of-order scheduler. See https://agner.org/optimize/ for more.)

When the load uop runs, the address-generation unit (AGU) gets the 2 register inputs, the index scale factor (left shift by 2), and the immediate displacement (-4). The shift-and-add hardware in the AGU calculates the load address.

The next step inside the load execution unit is to use that address to load from L1d cache, which has the first-level L1dTLB (the virtual->physical translation cache) basically built in. L1d is virtually indexed, so the TLB lookup can happen in parallel with fetching the 8 tags+data of the indexed set of L1d cache. Assuming a hit in both the L1dTLB and L1d cache, the load execution unit receives the load result ~5 cycles later.
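Why virtual indexing lets the two lookups overlap can be sketched numerically. The sizes below are assumptions typical of Skylake-class cores (32 KiB L1d, 8 ways, 64-byte lines, 4 KiB pages), and the physical page address is invented; the point is that the set index uses only bits inside the page offset, which translation does not change.

```python
LINE_SIZE = 64           # bytes per cache line (assumed)
WAYS = 8                 # 8 tags+data per set, as in the text
CACHE_SIZE = 32 * 1024   # assumed L1d size
PAGE_SIZE = 4096         # assumed page size

NUM_SETS = CACHE_SIZE // (WAYS * LINE_SIZE)


def set_index(addr):
    """Set selection uses address bits 6..11 with these sizes."""
    return (addr // LINE_SIZE) % NUM_SETS


virt = 0x080497E8
# Hypothetical translation: a different page frame, same offset within the page.
phys = 0x3F2A1000 | (virt % PAGE_SIZE)

print(NUM_SETS, set_index(virt), set_index(phys))  # 64 31 31
```

Because both addresses select the same set, the cache can start reading that set's 8 tags with the virtual address and compare them against the physical address once the TLB delivers it.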

That load result is forwarded to the ALU execution unit as a source operand. The ALU doesn't care whether the instruction was imul eax, ebx or had a memory source operand; the multiply uop is simply dispatched to the ALU as soon as both input operands are ready.

jmp    DWORD PTR [eax*4+0x80497e8]

Yes, eax*4 is the scaled index.

Yes, 0x80497e8 is the disp32 displacement. In this case, the displacement component of the addressing mode is probably being used as the address of a static jump table. You can think of it as the base for this addressing mode.

"jump to that address"

Nope: load a new EIP value from that address. It's a memory-indirect jump because of the square brackets.

What you described would be:

lea   eax, [eax*4+0x80497e8]       ; address calc
jmp   eax                          ; jump to code at that address

There's no way to jump to a computed address in one instruction: the new EIP value always has to be in a register or be fetched as data from memory.
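The difference between the two forms can be modelled in Python. Everything here is a hypothetical stand-in: the table address 0x20 plays the role of 0x80497e8, the three "code addresses" are invented, and the function names are just labels for the two instruction sequences.

```python
import struct

TABLE = 0x20  # hypothetical; stands in for 0x80497e8
memory = bytearray(64)
# A jump table: three 32-bit code addresses stored as data.
struct.pack_into("<3I", memory, TABLE, 0x1000, 0x1040, 0x1080)


def jmp_mem_indirect(eax):
    """jmp DWORD PTR [eax*4 + TABLE]: new EIP is the dword loaded from memory."""
    addr = eax * 4 + TABLE
    return struct.unpack_from("<I", memory, addr)[0]


def lea_then_jmp(eax):
    """lea eax, [eax*4 + TABLE] ; jmp eax: new EIP is the computed address itself."""
    return eax * 4 + TABLE


print(hex(jmp_mem_indirect(2)))  # 0x1080 (the table entry)
print(hex(lea_then_jmp(2)))      # 0x28   (the address of the table entry)
```

In the second case execution would continue inside the table itself, treating the stored pointers as machine code, which is almost never what a compiler-generated switch wants.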

Footnote 1: We're assuming a flat memory model (segment base = 0), so we can ignore segmentation, as is normal for code running under Linux, Windows, OS X, or pretty much any 32 or 64-bit OS. So the address calculation gives you a linear address.

I'm also assuming that paging is enabled, as is normal under a mainstream OS, so it's a virtual address that has to be translated to a physical one using the page tables, whose entries are cached by the TLB.
